WebMedia CarolinaWatanabe Et Al
Carolina W. Silva, Pedro H. Bugatti, Marcela X. Ribeiro, Caetano Traina Jr., Agma J. M. Traina
Computer Science Department - ICMC-USP
Caixa Postal 688 - 13560-970 - São Carlos - SP, Brasil
{carolina,pbugatti,mxavier,caetano,agma}@icmc.usp.br
2.1 Wavelets

Wavelets are mathematical functions that separate a signal into different frequency components, so that each component can be examined at a resolution matched to its scale. This multiresolution property makes it easier to extract image features from the transformed coefficients [14].

Our proposed technique works on image subspaces generated by applying wavelet transforms through the multiresolution method. The central element of a multiresolution analysis is a function φ(t), called the scaling function, whose role is to represent a signal at different scales. The translations of the scaling function constitute the building blocks of the representation of a signal at a given scale. The scale can be increased by dilating (stretching) the scaling function or decreased by contracting it [14].

The scaling function φ(t) acts as a sampling function (a basis), in the sense that the inner product of φ(t) with a signal represents a sort of average value of the signal over the support (extent) of φ. A recursive application of this process generates nested spaces V_j, i.e., ... ⊂ V_{-2} ⊂ V_{-1} ⊂ V_0 ⊂ V_1 ⊂ ..., which are the basis of the multiresolution analysis.

By definition, a signal in V_1 can be expressed as a superposition of translations of the function φ_1; but because the space V_0 is included in V_1, any function in V_0 can also be expanded in terms of the translations of φ_1(t). In particular, this is true for the scaling function itself. Consequently, there must exist a sequence of numbers h = {h_0, h_1, ...} such that the following relationship is satisfied:

    \phi_0(t) = \sum_{n} h_n \, \phi_1(t - n)    (1)

Equation 1, known as the scaling equation, is very important: it describes how the scaling function can be generated by superposing compressed copies of itself. Now it is possible to define a new space W_j as the orthogonal complement of V_j in V_{j+1}; in other words, W_j is the space of all functions in V_{j+1} that are orthogonal to all functions in V_j under the chosen inner product. The relationship to wavelets lies in the fact that the spaces W_m are spanned by dilations and translations of a function ψ(t); such a collection of basis functions is called a wavelet basis. As in the case of the scaling function, since the wavelet ψ(t) belongs to V_1, it can be expressed as a linear combination of φ(t) at scale m = 1, which can be written as:

    \psi(t) = \sum_{n} g_n \, \phi_1(t - n)    (2)

where the sequence g is called the wavelet sequence. In the literature, h and g are known as the low- and high-frequency filters, respectively.

The choice of a wavelet basis still represents an open problem for filtering [7]. Probably the most popular wavelets are the Daubechies wavelets, because of their orthogonality and compact support [16]. We chose the Symlet, Coiflet and Daubechies wavelets to explore in this work. Figure 1 shows an example of a wavelet decomposition and the configuration of the regions after decomposition.

2.2 Traditional Gray-Level Histogram

One of the most common techniques used to represent an image regarding its gray-level (color) content is the traditional histogram. It gives the frequency of occurrence of each gray level among the pixels of the image. Its omnipresence is mostly due to its nice properties: it has linear cost to obtain and, for normalized histograms, it is invariant to rotation, translation and scale. It can also be used as a first step to select the most relevant images for a query, thus reducing the candidate set before applying a more costly feature extractor to compare the images [10].

2.3 Distance Functions

The efficiency and efficacy of multimedia data retrieval are significantly affected by the inherent ability of the distance function to separate the data [5]. Thus, we briefly describe the distance functions that we evaluated in this paper.

2.3.1 Minkowski Family

The most widely used distance functions are those of the Minkowski family (or L_p norms) [17], which are employed in vector spaces. In a vector space the objects are identified with n real-valued coordinates {x_1, ..., x_n}. Thus the L_p distances are defined as:

    L_p((x_1, ..., x_n), (y_1, ..., y_n)) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}    (3)

The well-known Euclidean distance corresponds to L_2; according to the value assigned to p, we obtain the other variations of the L_p family. The L_2 (Euclidean) distance is commonly used to calculate the distance between vectors, and corresponds to the human notion of distance. This distance is additive, in the sense that each feature contributes independently to the measure of distance, and it is formally defined as:

    d_{L_2}(X, Y) = \sqrt{ \sum_{i=1}^{k} (x_i - y_i)^2 }    (4)

2.3.2 Statistic Value χ²

This distance function is calculated as the difference between each observed and theoretical frequency for each possible result, raising it to the square and dividing it by the theoretical frequency. The χ² distance is defined as:

    d_{\chi^2}(X, Y) = \sum_{i=1}^{n} \frac{(y_i - m_i)^2}{m_i}    (5)

where m_i = (y_i + x_i)/2. This distance function emphasizes the elevated discrepancies between the two compared feature vectors X and Y, and measures how improbable the distribution is.

[Diagram residue: the user issues a kNN query request, which is run with the appropriate distance function against the indexed image database of feature vectors, returning the query results.]

Figure 2: Proposed method of feature extraction using 4 levels of decomposition. Each pixel value of the approximation subspace is assigned to the feature vector.

…f_i in the remaining dataset; σ_max, the maximum standard deviation of the f_i values allowed in a given class; and γ_min, the minimum confidence to reject the hypothesis H0. StARMiner mines rules of the form x_j → f_i if the conditions given in Equations 6, 7 and 8 are satisfied:

    \mu_{f_i}(T_{x_j}) - \mu_{f_i}(T - T_{x_j}) \ge \Delta\mu_{min}    (6)

    \sigma_{f_i}(T_{x_j}) \le \sigma_{max}    (7)
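As a concrete illustration of Equations 3, 4 and 5, the sketch below implements the Minkowski, Euclidean and χ² distances over plain Python feature vectors (the function names are ours, not part of the paper):

```python
def minkowski(x, y, p):
    """L_p distance of Equation 3: p-th root of the summed |x_i - y_i|^p."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def euclidean(x, y):
    """Equation 4: the L_2 member of the Minkowski family."""
    return minkowski(x, y, 2)

def chi_square(x, y):
    """Equation 5: sum of (y_i - m_i)^2 / m_i with m_i = (y_i + x_i) / 2."""
    total = 0.0
    for a, b in zip(x, y):
        m = (a + b) / 2.0
        if m != 0.0:                 # skip positions that are zero in both vectors
            total += (b - m) ** 2 / m
    return total

x, y = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
print(minkowski(x, y, 1))   # 7.0 (L1, the city-block distance)
print(euclidean(x, y))      # 5.0 (sqrt of 9 + 16 + 0)
print(chi_square(x, y))     # ≈ 1.9 (0.9 + 1.0 + 0.0)
```

Note that the χ² sum skips positions where both vectors are zero, since m_i would vanish there; for histogram-like vectors this is consistent with m_i being the average of the two observed frequencies.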
[Figure residue: Precision vs. Recall curves of the coif, db and sym wavelet feature vectors and the gray-level histogram with: (a) 4 levels of resolution; (b) 5 levels of resolution; and (c) 6 levels of resolution. Legends include coif1, coif2, db1–db8, sym2–sym15 and histogram.]

[Figure residue: "Feature Extractor – Texture" Precision vs. Recall panels comparing the L1, L2 and L∞ metrics with their StARMiner-selected and weighted (k=0, k=1) variants.]

These methods are well suited to represent the images under evaluation, since the precision values are above 80% for all recall values smaller than 65%. It is important to emphasize that the regions of low recall are the most important in a CBIR system, because k-nearest neighbor queries usually do not ask for high values of k.

Thus, by comparing the results provided by the feature vectors with 256 and 64 elements, where the smaller one brings a more accurate image set, we can conclude that the dimensionality curse really damages the results. This happens because the irrelevant features disturb the influence of the relevant ones. Moreover, the application of the wavelet transform in 5 levels through the multiresolution method reduced the redundancy of information in the data, while still representing the images well for executing similarity queries.
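The feature vectors compared above come from the approximation (low-frequency) subspace of the multiresolution decomposition, as described in Section 2.1 and in Figure 2. A minimal sketch of that extraction using the db1 (Haar) scaling filter — assuming images whose sides are divisible by 2^levels; the function name is ours:

```python
import numpy as np

def db1_approximation_features(img, levels):
    """Apply the db1/Haar low-pass (scaling) filter `levels` times along rows
    and columns, keep only the approximation subspace, and flatten it into
    a feature vector."""
    a = np.asarray(img, dtype=float)
    for _ in range(levels):
        # The 1-D db1 low-pass filter is (1/sqrt(2), 1/sqrt(2)); applied along
        # both axes it turns every 2x2 block into (sum of the block) / 2.
        a = (a[0::2, 0::2] + a[0::2, 1::2] + a[1::2, 0::2] + a[1::2, 1::2]) / 2.0
    return a.ravel()

# A 16x16 image decomposed for 2 levels leaves a 4x4 approximation,
# i.e. a 16-element feature vector; 4 levels leave a single coefficient.
img = np.ones((16, 16))
print(db1_approximation_features(img, 2).shape)   # (16,)
```

Under this scheme, a 256x256 image decomposed for 4 levels yields a 16x16 approximation (256 features), for 5 levels an 8x8 one (64 features), and for 6 levels a 4x4 one (16 features) — the vector sizes discussed in the experiments.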
Figure 5: Precision vs. Recall curves showing the retrieval behavior of the proposed method with 5 resolution levels using 64 features with the L2 metric

…in data dimensionality. The queries performed using the db1 feature vectors with 5 levels of resolution give precision levels up to 82.55% above the image histogram, for queries that ask for up to 90% of the images.

4.2 Experiment 2 - The 704 Image Dataset

A larger image dataset, with 704 MR images classified into eight categories, was used herein. The number of images in each category is: Angiogram (36), MR Axial Pelvis (86), MR Axial Head (155), MR Sagittal Head (258), MR Coronal Abdomen (23), MR Sagittal Spine (59), MR Axial Abdomen (51) and MR Coronal Head (36). Figure 8 shows one image from each category. As in the previous experiment, the best result was obtained applying a Daubechies wavelet transform; according to Wang [16], the Daubechies wavelet achieves excellent results in image processing due to its properties. We therefore used several wavelets of the Daubechies family with 4, 5 and 6 levels of resolution through the multiresolution method.

…the curves generated with 4 and 5 levels of resolution are equivalent, even though there is a large difference between the numbers of features in their vectors. For 6 levels of resolution we still have an excellent result (see the db1-6n-16 curve): with just 16 features, the precision is above 80% for recall values up to 90%. Comparing the 256 features with the 16, we have a dimensionality reduction of 93.75%. Also, note that the larger the number of filter coefficients, the smaller the precision of the queries, considering the same level of resolution. These results suggest that relevant features are removed from the image when we use filters larger than 2.

Figure 9: Precision vs. Recall graphs generated by several Daubechies wavelet transforms in 4, 5 and 6 levels of multiresolution with the L2 metric (curves db1-4n, db1-5n, db1-6n and histogram)

Figure 10 shows the best Precision vs. Recall curves with 256, 64 and 16 features, respectively, from Figure 9, and compares them with the curve given by the gray-level histogram. It is clear that all three methods perform better than the histogram. Numerically, we get an improvement in precision of up to 531% for recall values up to 95% with the feature vector of 256 features. For 64 features, the improvement in precision is up to 528% at a recall of 95%; and for 16 features, the improvement in precision is up to 491% at a recall of 90%.

Figure 10: Precision vs. Recall graphs showing the retrieval behavior of the best curves and the gray-level histogram with the L2 metric

To compare this method with another one from the literature, we used a technique proposed by Balan [3], which employs an improved version of the EM/MPM method to segment images; for each region segmented based on texture, six features are extracted: the mass (m); the centroid (x0 and y0); the average gray level (μ); the fractal dimension (D); and the linear coefficient used to estimate D (b). Therefore, when an image is segmented into L classes, the feature vector has L * 6 elements. Here we use L = 5, so the feature vector has 30 features. Figure 11 illustrates the feature vector described.

Figure 11: Layout of the texture feature vector: D1 x01 y01 m1 μ1 b1 ... DL x0L y0L mL μL bL, i.e., the features of texture classes 1 through L

Over these image feature vectors we executed the data mining algorithm StARMiner, which was described in Section 2.4.1. The mined rules identify the most relevant image features, generating…

…tors extracted by the improved version of EM/MPM, which is one of the best methods in the literature. We can see that our method performs better when processing similarity (k-NN) queries. Note also that our method demands fewer features than EM/MPM. We can also compare the time spent to process an image and extract the features: while the improved version of EM/MPM spends around 17.05 seconds per image, our method with 16 features spends around 0.77 seconds. Even selecting the most relevant features by using StARMiner, our method performs better than the improved version of EM/MPM with 16 features.

4.3 Experiment 3 - The ALOI Image Dataset

The third image dataset employed in the experiments is the Amsterdam Library of Object Images (ALOI) [8], a gray-scale image collection of one thousand small objects, recorded for scientific purposes in several configurations. The ALOI-ILL dataset consists of 24,000 gray-scale images, recorded under 24 different illumination angles for each object. The images are represented by 8 bits, resulting in 256 gray levels, with dimensions of 384 x 288 pixels. Figure 13 shows a sample of 10 elements of this dataset. As in the previous experiments the best results were obtained using the db1 wavelet, so we applied our method with that one.

In this experiment, the Precision vs. Recall curves are built as the average plots of 240-nearest neighbor queries, using 20% of the dataset.
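Average Precision vs. Recall plots of this kind can be computed with a simple routine: for each k-NN query, walk the ranking, count a retrieved image as relevant when it belongs to the query's class, and record the precision reached at each recall level. A minimal sketch under that protocol (names are ours):

```python
def precision_recall_points(ranked_classes, query_class, n_relevant):
    """Walk a k-NN ranking and record a (recall, precision) point at each hit."""
    hits, points = 0, []
    for rank, cls in enumerate(ranked_classes, start=1):
        if cls == query_class:
            hits += 1
            points.append((hits / n_relevant, hits / rank))
    return points

# Toy ranking: the classes of the 5 nearest neighbors of a query of class 'A',
# with 3 relevant images ('A') present in the dataset.
curve = precision_recall_points(['A', 'B', 'A', 'A', 'B'], 'A', 3)
print(curve[-1])   # (1.0, 0.75): full recall is reached at precision 0.75
```

Averaging these points over many queries (e.g. the 240-nearest neighbor queries over 20% of the dataset) yields curves such as those shown in the figures.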
To evaluate the correctness of each image retrieved, we used the object portrayed in the image as a class attribute, therefore performing a supervised automated evaluation of the algorithm. For each query, the images considered for the computation of precision are those belonging to the same class in the dataset. The plots show the average result using all images of the dataset as query images.

We evaluated the L1, L2, L∞, Jeffrey divergence, Canberra and χ² distance functions. The best results were obtained using the χ² distance function, as we can notice in Figure 14.

Figure 12: Precision vs. Recall graphs from db1-6n with 16 features and from the improved version of the EM/MPM (curves: 30 features (EM/MPM); 15 features (StARMiner); 16 features (db1-6n))

Figure 13: Examples of images from the dataset

Figure 15: Precision vs. Recall graphs obtained over the ALOI-ILL dataset for each category employing db1 with 5 and 6 levels of decomposition with χ² and the StARMiner feature selection (curves: 5n - 108 features; 5n - 60 features SM; 6n - 30 features; 6n - 20 features SM; 6n - 27 features; histogram)

…built by executing similarity queries employing: (1) the proposed feature vector comprising 27 features, using db1 with 6 levels of decomposition; (2) the proposed feature vector comprising 108 features, using db1 with 5 levels of decomposition; and (3) the features obtained using the traditional gray-level histograms, each vector comprising 256 features, since the images are represented by 8 bits.

Thus, analysing the graphs of Figure 15, we can notice that both instances of the proposed feature vectors (with 27 and 108 features) presented a considerable improvement in precision, up to 75% and 125% for recall levels of 45% and 50% respectively, in comparison with the precision achieved by the traditional gray-level histogram. Hence, these results testify that the proposed feature vector improves the precision of similarity queries. In such a case, we can argue that our proposed feature vector is well suited to content-based medical image retrieval, as well as to a generic image dataset such as ALOI-ILL. In addition to the improvement in precision, it is important to highlight that the method also performs a notable dimensionality reduction in comparison with the dimensionality of the traditional gray-level histograms (256 dimensions): up to 90% and 58%, considering respectively db1 with 6 levels of decomposition (the curve 6n-27 features of Figure 15) and with 5 levels (the curve 5n-108 features).

Figure 16 illustrates the Precision vs. Recall graphs ob…
[2] R. A. Baeza-Yates and B. A. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley, Wokingham, UK, 1999.
[3] A. G. R. Balan, A. J. M. Traina, C. Traina Jr., and P. M. d. A. Marques. Fractal analysis of image textures for indexing and…