
Improving CBIR Using Feature Extraction Based on

Wavelet Transform

Carolina W. Silva, Pedro H. Bugatti, Marcela X. Ribeiro, Caetano Traina Jr., Agma J. M. Traina
Computer Science Department - ICMC-USP
Caixa Postal 668 - 13560-970 - São Carlos - SP, Brasil
{carolina,pbugatti,mxavier,caetano,agma}@icmc.usp.br

ABSTRACT
The semantic gap and the curse of dimensionality are two shortcomings of content-based image retrieval techniques that rely on features automatically extracted from images to process similarity queries. The first one is the semantic gap that exists between the low-level features automatically extracted by a computational system and the high-level user interpretation of the images. The second one involves the problems that occur when similarity is defined over high-dimensional feature spaces. This paper presents a method that deals with both of these shortcomings. We use discrete wavelet transforms to obtain the image representation from a multiresolution point of view. The feature vectors are composed of the features of the approximation subspace, which succinctly represent the images in the processing of similarity queries. In addition, the multiresolution method is used to reduce the dimensionality of the feature space. This work shows the evaluation of three different image datasets, where the first two are composed of medical images and the third one is a generic image dataset. The results are promising, showing an improvement of up to 90% in precision for recall values up to 65% in the query results using the Daubechies wavelet transform.

Categories and Subject Descriptors
H.3.m [Information Storage and Retrieval]: Miscellaneous; I.4.7 [Image Processing and Computer Vision]: Feature Measurement - feature representation

Keywords
Wavelets, CBIR, Image Processing, Feature Vector, Multiresolution

This work has been supported by CNPq, FAPESP and CAPES.

1. INTRODUCTION
Content-based image retrieval (CBIR) has been an active research area for the last fifteen years, and a variety of techniques have been developed. However, retrieving images based on low-level features has proved to be unsatisfactory, and new techniques are needed to support high-level queries.

One important use of CBIR is in the biomedical disciplines, where content-based indexing and retrieval based on the information contained in the pixel data of medical images is expected to have a great impact on biomedical image databases [9]. However, existing systems are not well suited to the special needs of medical imagery, because several out-of-context images are mixed in with the relevant ones; thus, novel methodologies are urgently needed. Another driver is that image collections expand very quickly today, with the falling price of storage, faster internet connections, and affordable off-the-shelf digital cameras.

It is important to recognize the shortcomings of CBIR as a real-world technology. One problem with all current approaches is the reliance on visual similarity for judging semantic similarity, which may be problematic due to the semantic gap between the low-level content automatically extracted and the higher-level concepts sought by the user [6].

Basically, all systems assume the equivalence of an image and its representation in the feature space. These systems often use measurement systems such as the easily understandable Euclidean vector space model for measuring distances between a query image (represented by its features) and the possible results, representing all images as feature vectors in an n-dimensional vector space. Still, the use of high-dimensional feature spaces has been shown to cause problems, and caution should be taken when choosing the distance measure in order to retrieve meaningful results. These problems with similarity definitions in high-dimensional feature spaces are also known as the curse of dimensionality, and they have also been discussed in the domain of medical imaging [10].

Beyer et al. [4] proved that increasing the number of features (and consequently the dimensionality of the data) leads to a loss of significance of each feature value. Thus, to avoid decreasing the discrimination accuracy, it is important to keep the number of features as low as possible, establishing a trade-off between discrimination power and feature vector size.

Aiming at overcoming the problems of the semantic gap and the curse of dimensionality, this paper presents a simple but powerful feature extractor based on the multiresolution wavelet transform, which uses the approximation subspace to compose the feature vector that represents the image. Applying our method achieves 90% precision in the retrieval of medical images for queries that ask for up to 65% of the image set.

The remainder of this paper is structured as follows. Section 2 presents the main concepts needed to follow the paper. Section 3 presents the proposed method, while Section 4 discusses the experiments and results achieved with the developed method. Finally, Section 5 presents the conclusions of this work.
2. BACKGROUND
The characteristics (features) extracted from multimedia data are one of the key aspects of the similarity comparison between complex data (e.g., images, videos, sounds, time series, and DNA sequences, among others). As cited before, the features are arranged in a vector, named the feature vector, which can be seen as an n-dimensional point in a vector space.

This section presents a brief review of the multiresolution method and wavelet theory. We also describe traditional histograms, the distance functions that we use in this paper to quantify the similarity between feature vectors, and the Statistical Association Rules Miner algorithm, called StARMiner.

Figure 1: Example of wavelet decomposition. (a) original image; (b) image decomposed in two steps using Haar wavelets; (c) configuration of regions after the decomposition.

2.1 Wavelets
Wavelets are mathematical functions that separate a signal into different frequency components and then examine each component with a resolution matched to its scale. Wavelets have a multiresolution property that makes it easier to extract image features from the transformed coefficients [14].

Our proposed technique works on image subspaces generated by applying wavelet transforms through the multiresolution method. The central element of a multiresolution analysis is a function φ(t), called the scaling function, whose role is to represent a signal at different scales. The translations of the scaling function constitute the building blocks of the representation of a signal at a given scale. The scale can be increased by dilating (stretching) the scaling function, or decreased by contracting it [14].

The scaling function φ(t) acts as a sampling function (a basis), in the sense that the inner product of φ(t) with a signal represents a sort of average value of the signal over the support (extent) of φ. A recursive application of this process generates new nested spaces V_j, i.e., ... ⊂ V_{−2} ⊂ V_{−1} ⊂ V_0 ⊂ V_1 ⊂ ..., which are the basis of the multiresolution analysis.

By definition, a signal in V_1 can be expressed as a superposition of translations of the function φ_1, but because the space V_0 is included in V_1, any function in V_0 can also be expanded in terms of the translations of φ_1(t). In particular, this is true for the scaling function itself. Consequently, there must exist a sequence of numbers h = {h_0, h_1, ...} such that the following relationship is satisfied:

    φ_0(t) = Σ_n h_n φ_1(t − n/2)    (1)

Equation 1 is very important and is known as the scaling equation. It describes how the scaling function can be generated by superposing compressed copies of itself. Now it is possible to define a new space W_j as the orthogonal complement of V_j in V_{j+1}. In other words, W_j is the space of all functions in V_{j+1} that are orthogonal to all functions in V_j under the chosen inner product. The relationship to wavelets lies in the fact that the spaces W_m are spanned by dilations and translations of a function ψ(t); such a collection of basis functions is called a wavelet basis.

As in the case of the scaling function, since the wavelet ψ(t) belongs to V_1, it can be expressed as a linear combination of φ(t) at scale m = 1, which can be written as:

    ψ(t) = Σ_n g_n φ_1(t − n/2)    (2)

where the sequence g is called the wavelet sequence. In the literature, h and g are known as the low- and high-frequency filters, respectively.

The choice of a wavelet basis still represents an open problem for filtering [7]. Probably the most popular wavelets are the Daubechies wavelets, because of their orthogonality and compact support [16]. We chose the Symlet, Coifman and Daubechies wavelets to explore in this work. Figure 1 shows an example of a wavelet decomposition and the configuration of regions after the decomposition.
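A minimal sketch of such a multiresolution decomposition is shown below, assuming the PyWavelets (pywt) and NumPy libraries; the paper itself used Matlab 6.5, so the library choice and the random stand-in image are illustrative only.

```python
# Sketch of a 2D multiresolution wavelet decomposition with PyWavelets.
import numpy as np
import pywt

image = np.random.rand(256, 256)  # stand-in for a 256 x 256 gray-level image

# wavedec2 applies the 2D discrete wavelet transform recursively.  It returns
# [cA_n, (cH_n, cV_n, cD_n), ..., (cH_1, cV_1, cD_1)], where cA_n is the
# approximation subspace after n decomposition levels and each tuple holds
# the horizontal, vertical and diagonal detail subbands of one level.
coeffs = pywt.wavedec2(image, wavelet="db2", mode="periodization", level=4)
approximation = coeffs[0]

# With periodization, each level halves both dimensions, so four levels
# reduce a 256 x 256 image to a 16 x 16 approximation subband.
print(approximation.shape)  # (16, 16)
```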

2.2 Traditional Gray-Level Histogram
One of the most common techniques used to represent an image regarding its gray-level (color) content is the traditional histogram. It gives the frequency of occurrence of each gray level among the pixels of the image. Its omnipresence is mostly due to its nice properties: it has linear cost to compute, and normalized histograms are invariant to rotation, translation and scale. It can also be used as a first step for selecting the most relevant images for a query, reducing the candidate set before a more costly feature extractor is applied to compare the images [10].
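For illustration, a minimal NumPy sketch of a normalized gray-level histogram (our own example, not code from the paper):

```python
import numpy as np

def gray_level_histogram(image: np.ndarray, levels: int = 256) -> np.ndarray:
    """Normalized gray-level histogram of an 8-bit image.  It is computed in
    a single pass over the pixels (linear cost), and dividing by the pixel
    count yields the normalization that makes the descriptor invariant to
    image scale, as noted in the text."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    return hist / image.size
```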
2.3 Distance Functions
The efficiency and efficacy of multimedia data retrieval are significantly affected by the inherent ability of the distance function to separate the data [5]. Thus, we briefly describe the distance functions that we evaluated in this paper.

2.3.1 Minkowski Family
The most widely used distance functions belong to the Minkowski family (or L_p norms) [17], which is employed in vector spaces. In a vector space the objects are identified with n real-valued coordinates {x_1, ..., x_n}. The L_p distances are defined as:

    L_p((x_1, ..., x_n), (y_1, ..., y_n)) = ( Σ_{i=1..n} |x_i − y_i|^p )^(1/p)    (3)

The well-known Euclidean distance corresponds to L_2. According to the value assigned to p, we obtain the L_p family variations. The L_2 (Euclidean) distance is commonly used to calculate the distance between vectors and corresponds to the human notion of distance. This distance is additive, in the sense that each feature contributes independently to the measure, and it is formally defined as:

    d_L2(X, Y) = √( Σ_{i=1..k} (x_i − y_i)² )    (4)
Figure 2: Proposed method of feature extraction using 4 levels of decomposition. Each pixel value of the approximation subspace is assigned to the feature vector. (The diagram shows the prototype pipeline: the user issues a k-NN query request with the appropriate distance function, the extracted feature vectors are indexed over the image database, and the query results are returned.)

2.3.2 The χ² Statistic Value
This distance function is calculated as the difference between each observed and theoretical frequency for each possible result, raising it to the square and dividing each one by the theoretical frequency. The χ² distance is defined as:

    d_χ²(X, Y) = Σ_{i=1..n} (y_i − m_i)² / m_i    (5)

where m_i = (y_i + x_i)/2. This distance function emphasizes the elevated discrepancies between the two compared feature vectors X and Y, and measures how improbable the distribution is.
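Assuming NumPy feature vectors, Equations 3 to 5 can be sketched as below; the zero-bin guard in the χ² function is our own implementation detail, not specified in the text.

```python
import numpy as np

def minkowski(x: np.ndarray, y: np.ndarray, p: float) -> float:
    """L_p distance of Equation 3; p = 1 gives L1 and p = 2 gives the
    Euclidean distance of Equation 4."""
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

def chi_square(x: np.ndarray, y: np.ndarray) -> float:
    """Chi-square statistic value of Equation 5, with m_i = (x_i + y_i) / 2.
    Positions where m_i = 0 contribute nothing and are skipped to avoid a
    division by zero."""
    m = (x + y) / 2.0
    nz = m != 0
    return float(np.sum((y[nz] - m[nz]) ** 2 / m[nz]))
```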


2.4 Association Rules
In this work, association rules are employed to perform feature selection and, consequently, dimensionality reduction of the feature vectors. Mining association rules [1] is one of the most investigated areas in data mining. Finding associations is currently used in many applications, such as customer categorization, data classification and summarization [11, 12]. Mining images demands extracting their main features regarding specific criteria. Once extracted, the feature vectors and image descriptions are submitted to the mining process.

Dimensionality reduction (also called dimension reduction) aims at reducing the number of features (attributes or dimensions) used to represent a dataset. To achieve the dimensionality reduction of the feature vectors, we employ the StARMiner algorithm [13], which mines statistical association rules from the features of a training dataset and is briefly described in the following section.
2.4.1 Statistical Association Rule Mining
The essence of this approach is the concept of association rule mining. Traditionally, the problem of mining association rules consists in finding relationships of the form A → B, where A and B are sets of items, indicating that A and B frequently occur together in the database transactions and that, if A occurs, there is a high probability that B also occurs. This type of rule works well when dealing with categorical (nominal) data items. However, when dealing with image features, which consist of continuous attributes, a type of association rule that considers continuous values is necessary. A recent type of continuous association rule is the statistical association rule, which is generated using statistical measurements.

Let T be a dataset (for instance, of medical images), x_j an image class, T_xj ⊆ T the subset of images of class x_j, and f_i the i-th feature of the feature vector F. Let μ_fi(Z) and σ_fi(Z) be, respectively, the mean and standard deviation of the values of feature f_i in the subset of images Z. The algorithm uses three thresholds defined by the user: Δμ_min, the minimum allowed difference between the average of feature f_i in images of class x_j and its average in the remaining dataset; σ_max, the maximum standard deviation of the f_i values allowed in a given class; and γ_min, the minimum confidence to reject the hypothesis H0. StARMiner mines rules of the form x_j → f_i if the conditions given in Equations 6, 7 and 8 are satisfied:

    |μ_fi(T_xj) − μ_fi(T − T_xj)| ≥ Δμ_min    (6)

    σ_fi(T_xj) ≤ σ_max    (7)

    H0: μ_fi(T_xj) = μ_fi(T − T_xj)    (8)

In Equation 8, H0 should be rejected with a confidence equal to or greater than γ_min, in favor of the hypothesis that the means μ_fi(T_xj) and μ_fi(T − T_xj) are statistically different. A rule x_j → f_i returned by the algorithm relates a feature f_i to a class x_j in which the values of f_i have a statistically distinct behavior. This property indicates that feature f_i can distinguish images of class x_j from the remaining ones.

The StARMiner algorithm associates classes with the features that have the highest power to distinguish the images. The features returned in the rules have a particular and uniform behavior in the images of a given category. This matters because features presenting a uniform behavior over every image in the dataset, independently of the image category, do not contribute to categorizing the images and should be eliminated from the feature vector. Therefore, the StARMiner rules are very useful to reveal the relevance of the image features.
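The rule test of Equations 6 to 8 can be sketched as below. This is only our reading of the conditions: SciPy's Welch t-test stands in for the hypothesis test of H0, whose exact form the section does not specify, and all names are illustrative.

```python
import numpy as np
from scipy import stats

def rule_holds(in_class: np.ndarray, rest: np.ndarray, d_mu_min: float,
               sigma_max: float, gamma_min: float) -> bool:
    """Checks Equations 6-8 for one feature f_i and one class x_j, where
    in_class holds the f_i values over T_xj and rest those over T - T_xj."""
    eq6 = abs(in_class.mean() - rest.mean()) >= d_mu_min   # Equation 6
    eq7 = in_class.std() <= sigma_max                      # Equation 7
    # Equation 8: reject H0 (equal means) with confidence >= gamma_min;
    # a Welch t-test is used here as a stand-in for the unspecified test.
    _, p_value = stats.ttest_ind(in_class, rest, equal_var=False)
    eq8 = (1.0 - p_value) >= gamma_min
    return eq6 and eq7 and eq8
```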
3. THE PROPOSED METHOD
Our method was built to deal with two inherent drawbacks of CBIR systems: the high dimensionality of feature vectors and the semantic gap. We tackle the first by applying higher resolution levels in the multiresolution technique, and the second by characterizing images through feature vectors composed of the approximation subspace, which is obtained by convolving each image with the wavelet filters. We chose the following wavelet filters: Coifman (coif1 and coif2), Symlet (sym2, sym3, sym4, sym5 and sym15) and Daubechies (db1, db2, db3, db4 and db8). The wavelet mnemonics used in this work are the same ones employed in the Matlab 6.5 tool.

We use 4, 5 and 6 levels of resolution, and the approximation subspace is represented by reading it column by column, assigning the values obtained to the feature vector. It is important to highlight that in our method the feature vector is composed of the image itself, at a reduced scale; hence this methodology establishes a new approach. Equation 9 gives the number of elements of the proposed feature vector, where width is the image width, height is the image height and n is the level of decomposition:

    #features = (width / 2^n) × (height / 2^n)    (9)

For instance, if an image has 256 × 256 pixels and 4 levels of resolution are applied, the vector has 256 features; with 5 levels, the vector has 64 features; and with 6 levels, the vector has 16 features, which is the total number of pixels in the approximation subspace.

As different distance functions can better separate specific feature vectors, it is important to assess the ones best suited to each type of feature from the image databases. Thus, we combine the proposed feature vector with the distance function appropriate for the calculated features. Figure 2 graphically summarizes the proposed method and the developed prototype, which is described in Section 4.
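A sketch of the extractor follows, again assuming PyWavelets in place of the authors' Matlab code; the column-by-column reading of the approximation subspace corresponds to Fortran-order flattening.

```python
import numpy as np
import pywt  # assumption: PyWavelets stands in for the Matlab 6.5 toolbox

def extract_features(image: np.ndarray, wavelet: str = "db1",
                     levels: int = 4) -> np.ndarray:
    """Proposed feature vector: the n-level approximation subspace of the
    image, read column by column.  With periodization the subband has
    (width / 2^n) x (height / 2^n) elements, matching Equation 9."""
    coeffs = pywt.wavedec2(image, wavelet=wavelet, mode="periodization",
                           level=levels)
    return coeffs[0].flatten(order="F")  # column-to-column reading

# For a 256 x 256 image: 4 levels -> 256 features, 5 -> 64, 6 -> 16.
image = np.random.rand(256, 256)
for n, size in [(4, 256), (5, 64), (6, 16)]:
    assert extract_features(image, "db1", n).size == size
```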
4. EXPERIMENTS AND RESULTS
In order to evaluate the effectiveness of the proposed technique, we worked on a variety of medical image categories and on the Amsterdam Library of Object Images (ALOI) [8]. As an efficacy measure we generated graphs based on the Precision vs. Recall approach [2], obtained from the results of sets of similarity queries.

Recall indicates the proportion of relevant images in the database that have been retrieved when answering a query, and Precision is the portion of the retrieved images that are relevant to the query. As a rule of thumb, the closer a Precision vs. Recall curve is to the top of the graph, the better the technique. They are formally defined as:

    Precision = |RA| / |R|        Recall = |RA| / |A|    (10)

where RA is the number of retrieved and relevant images, R is the number of all images retrieved by the query, and A is the number of relevant images in the dataset.

To build the Precision vs. Recall graphs, we applied sets of k-nearest neighbor (k-NN) queries. A k-NN query consists in searching the dataset for the k closest images to the query image, comparing their feature vectors with a defined distance function to compute how similar the images are. Using the proposed method, we developed a prototype to process k-NN queries. Figure 3 shows an example of a k-NN query performed by our prototype.

Figure 3: Example of a 10-nearest neighbor query performed by the developed prototype over an image database of 704 images (the query image and the thumbnails of its 10 nearest neighbors).

For our experiments on the medical image databases, each Precision vs. Recall curve represents the average of the curves obtained by performing a k-NN query for each image in the whole image set. For the ALOI database, we used just 20% of the dataset to perform the k-NN queries, due to the database size; this is a common configuration employed to perform a large number of tests over the algorithm under evaluation.

In order to accelerate similarity query processing, we used the Slim-tree [15] as the indexing structure of our prototype. The Slim-tree is a metric access method (MAM) specially developed to minimize disk accesses, making the whole system faster. Furthermore, we experimented with several metrics (see Section 2.3) to perform the k-NN queries.
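As an illustration of how the evaluation works, the sketch below runs a k-NN query by sequential scan (the Slim-tree index is not reproduced here) and computes the Precision and Recall of Equation 10; all names are ours.

```python
import numpy as np

def knn_query(query: np.ndarray, database: np.ndarray, k: int, dist) -> np.ndarray:
    """Indices of the k database vectors closest to the query under dist
    (a sequential scan; the paper uses a Slim-tree index instead)."""
    distances = np.array([dist(query, v) for v in database])
    return np.argsort(distances)[:k]

def precision_recall(retrieved_classes: list, query_class, n_relevant: int):
    """Equation 10: Precision = |RA| / |R| and Recall = |RA| / |A|."""
    ra = sum(1 for c in retrieved_classes if c == query_class)
    return ra / len(retrieved_classes), ra / n_relevant
```

For the 210-image dataset below, each class has 30 images, so n_relevant = 30; averaging the (precision, recall) points of one k-NN query per image yields the curves discussed next.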

4.1 Experiment 1 - The 210 Images Dataset
This dataset consists of 210 medical images classified into seven categories: Angiogram, MR (Magnetic Resonance) Axial Pelvis, MR Axial Head, MR Coronal Abdomen, MR Coronal Head, MR Sagittal Head and MR Sagittal Spine. Each category is represented by 30 images.

First, we compared the method using 4 levels of resolution and the Coifman (coif1 and coif2), Daubechies (db2 and db8) and Symlet (sym2, sym3, sym4, sym5 and sym15) wavelets. Each of these wavelet transforms generates a vector with 256 features. The Precision vs. Recall curves of the nine proposed vectors are shown in Figure 4. Each point of the graph is obtained as the average of 210 queries. To perform these experiments we employed the L2 distance function.

Analyzing the graphs of Figure 4, we notice that the wavelet that best represents the images is the Daubechies db2. We can also consider that the curve generated by the coif1 wavelet practically ties with db2.

Figure 4: Precision vs. Recall curves showing the retrieval behavior of the proposed method with 4 resolution levels using 256 features with the L2 metric.

For the graph of Figure 5, we compared the Coifman (coif1 and coif2) and Daubechies (db1, db2 and db8) wavelet transforms with 5 resolution levels, so their feature vectors have 64 features. Figure 5 shows the Precision vs. Recall curves of the queries on the dataset represented by these feature vectors.

Note that the Haar (or db1) wavelet is the best one to represent these images. As a basis for comparison, the graph in Figure 7 also presents the average Precision vs. Recall curve obtained by using gray-level histograms over the same image dataset. The results for Haar (db1) give the best Precision vs. Recall curves shown so far. We can see that all the proposed methods have better Precision vs. Recall curves than the histogram. The curve with the highest precision is the one generated by db1, with just 64 features, while the other methods use 256 features, i.e., there is a reduction of 75% in data dimensionality. The queries performed using the db1 feature vectors with 5 levels of resolution give precision gains of up to 82.55% over the image histograms, for queries that ask for up to 90% of the images.

Figure 5: Precision vs. Recall curves showing the retrieval behavior of the proposed method with 5 resolution levels using 64 features with the L2 metric.

These methods are well suited to represent the images under evaluation, since the precision values are over 80% for all recall values smaller than 65%. It is important to emphasize that the regions of low recall are the most important in a CBIR system, because k-NN queries usually do not ask for high values of k. Thus, comparing the results provided by the feature vectors with 256 and 64 elements, where the smaller one brings a more accurate image set, we can conclude that the dimensionality curse really damages the results. This happens because the irrelevant features disturb the influence of the relevant ones. Moreover, the application of the wavelet transform with 5 levels through the multiresolution method reduced the redundancy of information in the data, while still representing the images well for executing similarity queries.

Figure 7: Precision vs. Recall curves showing the retrieval behavior of the best curves and gray-level histograms with the L2 metric.
4.2 Experiment 2 - The 704 Images Dataset
A larger dataset, with 704 MR images classified into eight categories, was used here. The number of images in each category is: Angiogram (36), MR Axial Pelvis (86), MR Axial Head (155), MR Sagittal Head (258), MR Coronal Abdomen (23), MR Sagittal Spine (59), MR Axial Abdomen (51) and MR Coronal Head (36). Figure 8 shows one image from each category. As in the previous experiment, the best result was obtained by applying a Daubechies wavelet transform; according to Wang [16], the Daubechies wavelets achieve excellent results in image processing due to their properties. We therefore used several wavelets of the Daubechies family with 4, 5 and 6 levels of resolution in the multiresolution method.

Figure 8: Examples of images from the dataset. (a) Angiogram, (b) MR Axial Pelvis, (c) MR Axial Head, (d) MR Axial Abdomen, (e) MR Coronal Abdomen, (f) MR Coronal Head, (g) MR Sagittal Head, and (h) MR Sagittal Spine.

We evaluated several distance functions, such as L1, L2, L∞, Jeffrey Divergence, Canberra and χ². The L1, L2 and χ² metrics presented the best results. We can observe that the Precision vs. Recall curves obtained with these distance functions are equivalent, as shown in Figure 6. For this reason we employed the Euclidean distance function in the following experiments.

Figure 6: Precision vs. Recall graphs illustrating the comparison among several distance functions using the Daubechies db1 wavelet with: (a) 4 levels of resolution; (b) 5 levels of resolution; and (c) 6 levels of resolution.
Figure 9 shows the Precision vs. Recall curves generated by the proposed method. In the curve labels, the wavelet name is displayed first, then the level of resolution, and finally the number of elements of the feature vector. Observe that the Precision vs. Recall curves generated by the same wavelet transform at several levels of resolution decrease according to the wavelet chosen: the larger the number of filter coefficients, the faster the curves decrease when a higher level of resolution is chosen. Analyzing the db1 wavelet, which has two filter coefficients, note that the curves generated with 4 and 5 levels of resolution are equivalent, even though there is a large difference between the sizes of their feature vectors. With 6 levels of resolution we still obtain an excellent result (see the db1-6n-16 curve): with just 16 features, the precision is over 80% for recall values up to 90%. Comparing the 256-feature vectors with the 16-feature ones, we have a dimensionality reduction of 93.75%. Also note that, at the same level of resolution, the larger the number of filter coefficients, the smaller the precision of the queries. These results suggest that relevant features are removed from the image when filters with more than 2 coefficients are used.

Figure 9: Precision vs. Recall graphs generated by several Daubechies wavelet transforms with 4, 5 and 6 levels of multiresolution and the L2 metric.

Figure 10 shows the best Precision vs. Recall curves with 256, 64 and 16 features, respectively, from Figure 9, and compares them with the curve given by the gray-level histogram. It is clear that all three methods perform better than the histogram. Numerically, we get an improvement in precision of up to 531% for recall values up to 95% with the 256-feature vector; for 64 features, the improvement in precision is up to 528% at a recall of 95%; and for 16 features, the improvement in precision is up to 491% at a recall of 90%.

Figure 10: Precision vs. Recall graphs showing the retrieval behavior of the best curves and the gray-level histogram with the L2 metric.

To compare this method with another one from the literature, we used the technique proposed by Balan [3], which employs an improved version of the EM/MPM method to segment images. For each region segmented based on texture, six features are extracted: the mass (m); the centroid (x0 and y0); the average gray level (μ); the fractal dimension (D); and the linear coefficient used to estimate D (b). Therefore, when an image is segmented into L classes, the feature vector has L × 6 elements. Here we use L = 5, so the feature vector has 30 features. Figure 11 illustrates the feature vector described.

Figure 11: The feature vector: D1 x01 y01 m1 μ1 b1 | ... | DL x0L y0L mL μL bL, i.e., the six features of texture class 1 through the six features of texture class L.

Over these image feature vectors we executed the data mining algorithm StARMiner, described in Section 2.4.1. The mined rules identify the most relevant image features, generating a reduced feature vector, which in this case retained 16 features. Thereafter, the reduced feature vectors were used to index the images in a CBIR environment and to answer similarity queries.

Figure 12 shows the comparison of the curve generated by our method with 16 features (the db1-6n-16 curve) against the method that uses the improved version of the EM/MPM algorithm, one of the best methods in the literature, and against the StARMiner selection applied over the feature vectors extracted by the improved EM/MPM. We can see that our method performs better when processing similarity (k-NN) queries. Note also that our method demands fewer features than the EM/MPM. We can also compare the time spent to process an image and extract its features: while the improved version of EM/MPM spends around 17.05 seconds per image, our method with 16 features spends around 0.77 seconds. Even when the most relevant features are selected by StARMiner, our method still performs better than the improved version of EM/MPM with 16 features.

Figure 12: Precision vs. Recall graphs from db1-6n with 16 features and from the improved version of the EM/MPM.

4.3 Experiment 3 - The ALOI Image Dataset
The third image dataset employed in the experiments is the Amsterdam Library of Object Images (ALOI) [8], a gray-scale image collection of one thousand small objects, recorded for scientific purposes in several configurations. The ALOI-ILL dataset consists of 24,000 gray-scale images, recording each object under 24 different illumination angles. The images are represented with 8 bits, resulting in 256 gray levels, and have dimensions of 384 × 288 pixels. Figure 13 shows a sample of 10 elements of this dataset. As in the previous experiments the best results were obtained using the db1 wavelet, so we applied our method with this wavelet.

Figure 13: Examples of images from the dataset.

In this experiment, the Precision vs. Recall curves are built as the average plots of 240-nearest neighbor queries, using 20% of the dataset. To evaluate the correctness of each retrieved image, we used the object portrayed in the image as a class attribute, therefore performing a supervised automated evaluation of the algorithm. For each query, the images considered in the computation of precision are those of the same class in the dataset. The plots show the average result over all images of the dataset used as query images.

We evaluated the L1, L2, L∞, Jeffrey Divergence, Canberra and χ² distance functions. The best results were obtained using the χ² distance function, as we can notice in Figure 14.

Figure 14: Precision vs. Recall graphs illustrating the comparison among several distance functions using the Daubechies db1 wavelet with 6 levels of resolution.

The Precision vs. Recall graphs of Figures 15 and 16 correspond to the experiments performed on the ALOI-ILL image dataset represented by the proposed feature vector described in Section 3. The results were obtained using the χ² distance function, since it demonstrates the best precision gain in comparison with the traditional ones (e.g., the Minkowski family).

The Precision vs. Recall curves in the graphs of Figure 15 were built by executing similarity queries employing: (1) the proposed feature vector comprising 27 features, using db1 with 6 levels of decomposition; (2) the proposed feature vector comprising 108 features, using db1 with 5 levels of decomposition; and (3) the features obtained by using the traditional gray-level histogram, each vector comprising 256 features, since the images are represented with 8 bits.

Figure 15: Precision vs. Recall graphs obtained over the ALOI-ILL dataset for each category, employing db1 with 5 and 6 levels of decomposition with the χ² metric and the gray-level histogram.

Thus, analyzing the graphs of Figure 15, we notice that both instances of the proposed feature vector (with 27 and with 108 features) present a considerable improvement in precision, of up to 75% and 125% at recall levels of 45% and 50% respectively, in comparison with the precision achieved by the traditional gray-level histogram. Hence, these results testify that the proposed feature vector improves the precision of similarity queries. We can argue that our proposed feature vector is well suited to content-based retrieval of medical images, as well as to a generic image dataset like the ALOI-ILL. In addition to the improvement in precision, it is important to highlight that the method performs a notable dimensionality reduction in comparison with the dimensionality of the traditional gray-level histogram (256 dimensions): up to 90% and 58%, considering respectively db1 with 6 levels of decomposition (the 6n-27 features curve of Figure 15) and with 5 levels (the 5n-108 features curve).

Figure 16 illustrates the Precision vs. Recall graphs obtained by using the proposed feature vector comprising 27 features (db1 with 6 levels of decomposition), 108 features (db1 with 5 levels of decomposition), and also the features selected by the StARMiner algorithm. Analyzing these graphs, we observe that the proposed feature vector again clearly improves the precision of the similarity queries, and it is remarkable that the precision obtained after StARMiner selection almost ties with the precision obtained by our original feature vector. Although the precisions tie, combining StARMiner with our proposed feature vector provides a dimensionality reduction of approximately 26% and 45% in the feature vector sizes, considering the vectors comprising 27 and 108 features respectively, hence demanding less space and making the processing faster while preserving the precision of the similarity queries.

Figure 16: Precision vs. Recall graphs obtained over the ALOI-ILL dataset for each category, employing db1 with 5 and 6 levels of resolution with the χ² metric and the StARMiner feature selection.

5. CONCLUSIONS
In this paper we presented a technique based on wavelet approximation subspaces, which is used to compose the image feature vector to process similarity queries over image content. A tool based on the presented technique was implemented, first aiming at validating the proposed technique on real images of different tissues of the human body and at assisting the study and analysis of medical images, to be included in a PACS under development at our institution; and second, aiming at validating it on a generic image dataset that serves scientific purposes.

First, several wavelets were evaluated on the medical image datasets, and the Daubechies wavelets showed better efficacy than the other ones for the analyzed image sets. The achieved results showed that the proposed method performs very well, presenting an image retrieval precision always over 90% for recall values smaller than 65%. Moreover, we obtained a feature vector with just 16 elements that provided a better performance than the vector with 30 features obtained from segmented images using an improved version of the EM/MPM algorithm, which is a much more time-consuming method.

Second, as in the previous experiments the best results were obtained with the Daubechies filters, we applied our method using these filters on the ALOI-ILL, a generic image database larger than the other ones. We could notice that the Precision vs. Recall curves were not equivalent. In addition, combining the proposed method with StARMiner improved the precision of the queries. Thus, it can be inferred that for this database our method generated a feature vector that still had some redundant features when five levels of resolution were used (108 features). We then moved on to reduce the dimensionality of the feature vector using six levels of resolution (27 features), but this procedure led to the loss of some significant features, which is why the corresponding Precision vs. Recall curves decreased. In this case, the combination of our method and StARMiner provided a better representation of the images.

By the results obtained in this work, we can claim that wavelets and the multiresolution method are well suited to deal with the issues of the semantic gap and the dimensionality of feature vectors.

6. REFERENCES
[1] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In International Conference on Very Large Databases (VLDB), pages 487–499, Santiago de Chile, Chile, 1994.
[2] R. A. Baeza-Yates and B. A. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley, Wokingham, UK, 1999.
[3] A. G. R. Balan, A. J. M. Traina, C. Traina Jr., and P. M. d. A. Marques. Fractal analysis of image textures for indexing and retrieval by content. In 18th IEEE Intl. Symposium on Computer-Based Medical Systems (CBMS), pages 581–586, Dublin, Ireland, 2005. IEEE Computer Society.
[4] K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft. When is "nearest neighbor" meaningful? In C. Beeri and P. Buneman, editors, ICDT, volume 1540 of Lecture Notes in Computer Science, pages 217–235, Jerusalem, Israel, 1999. Springer Verlag.
[5] P. H. Bugatti, A. J. M. Traina, and C. Traina Jr. Assessing the best integration between distance-function and image-feature to answer similarity queries. In 23rd Annual ACM Symposium on Applied Computing, pages 1225–1230, Fortaleza, CE, Brazil, 2008.
[6] T. M. Deserno, S. Antani, and L. R. Long. Gaps in content-based image retrieval. In S. C. Horii and K. P. Andriole, editors, Medical Imaging 2007: PACS and Imaging Informatics, volume 6516 of SPIE, 2007.
[7] C. Garcia, G. Zikos, and G. Tziritas. Wavelet packet analysis for face recognition. Image and Vision Computing, 18(4):289–297, 2000.
[8] J.-M. Geusebroek, G. J. Burghouts, and A. W. M. Smeulders. The Amsterdam library of object images. International Journal of Computer Vision, 61(1):103–112, 2005.
[9] S. K. Kinoshita, P. M. d. A. Marques, R. R. P. Jr., J. A. H. Rodrigues, and R. M. Rangayyan. Content-based retrieval of mammograms using visual features related to breast density patterns. Journal of Digital Imaging, 20(2):172–190, 2007.
[10] H. Müller, N. Michoux, D. Bandon, and A. Geissbuhler. A review of content-based image retrieval systems in medical applications - clinical benefits and future directions. International Journal of Medical Informatics, 73(1):1–23, 2004.
[11] C. Ordonez, N. Ezquerra, and C. A. Santana. Constraining and summarizing association rules in medical data. Knowledge and Information Systems, 9(3):259–283, 2006.
[12] H. Pan, J. Li, and Z. Wei. Mining interesting association rules in medical images. In Advanced Data Mining and Medical Applications, pages 598–609. Springer Verlag, 2005.
[13] M. X. Ribeiro, A. G. R. Balan, J. C. Felipe, A. J. M. Traina, and C. Traina Jr. Mining statistical association rules to select the most relevant medical image features. In IEEE MCD'05, pages 91–98, Houston, USA, 2005. IEEE Computer Society.
[14] E. J. Stollnitz, T. D. DeRose, and D. H. Salesin. Wavelets for Computer Graphics: Theory and Applications. Morgan Kaufmann Publishers, Inc., San Francisco, CA, 1996.
[15] C. Traina Jr., A. J. M. Traina, B. Seeger, and C. Faloutsos. Slim-trees: High performance metric trees minimizing overlap between nodes. Research Paper CMU-CS-99-170, Carnegie Mellon University, School of Computer Science, October 1999.
[16] J. Z. Wang. Wavelets and imaging informatics: A review of the literature. Journal of Biomedical Informatics, 34:129–141, 2001.
[17] D. R. Wilson and T. R. Martinez. Improved heterogeneous distance functions. Journal of Artificial Intelligence Research, 6:1–34, 1997.
