Expert Systems with Applications
Yi-Leh Wu, Cheng-Yuan Tang, Maw-Kae Hor, Pei-Fen Wu
Keywords: Feature selection; Image retrieval; Genetic algorithms; Taguchi method; Hubert's Γ statistic

Abstract

Feature selection plays an important role in image retrieval systems. A better selection of features usually results in higher retrieval accuracy. This work selects the best feature set from a total of 78 low-level image features, including regional, color, and textural features, using the genetic algorithm (GA). However, the GA is known to be slow to converge. In this work we propose two directions to improve the convergence time of the GA. First, we employ the Taguchi method to reduce the number of offspring that must be tested in every generation of the GA. Second, we propose an alternative measure, the Hubert's Γ statistic, to evaluate the fitness of each offspring instead of evaluating the retrieval accuracy directly. The experiment results show that the proposed techniques improve the feature selection results of the GA in both time and accuracy.

© 2010 Elsevier Ltd. All rights reserved.
1. Introduction

Today's image retrieval systems usually employ low-level color, shape, and textural features to represent the contents of a given image. However, these low-level image features are usually not consistent with human perception of semantics, which results in less satisfactory image retrieval performance. In recent years many researchers have proposed image retrieval systems that are more aware of the human visual system, such as the Region-Based Image Retrieval (RBIR) systems (Carson, Belongie, Greenspan, & Malik, 2002; Li, Dai, Xu, & Er, 2008; Ma & Manjunath, 1997; Wang, 2001; Wang, Li, & Wiederhold, 2001), which employ objects or similar image regions as the basis for retrieval. When the whole image is the retrieval target, if there are many objects in the image or the image background is not related to the foreground objects, the retrieval results will be unsatisfactory. Image retrieval systems based on RBIR include Berkeley Blobworld (Carson et al., 2002), UCSB Netra (Ma & Manjunath, 1997), SIMPLIcity (Wang, 2001), etc.

The RBIR systems first segment images into many regions and then extract image features from each segmented region. Each region is represented by a feature vector. Feature vectors may have different dimensionality depending on the number of image features used to represent the given region. In this work, we employ the Blobworld (Carson et al., 2002) method to segment images. For each segmented region we then extract the low-level image features. Our initial image features set includes three categories of features. The first category represents the regional information, which includes the region position, the circumference, the area, etc. The second category represents the color information, which includes the Lab values, the invariant moments, the color moments, the color coherence vectors, etc. The third category represents the textural information, which includes the edge orientation histogram, the edge density, the anisotropy, the contrast, etc. A total of 78 image features are included in the initial image features set. The similarity of two given image regions is computed as the Euclidean distance between their corresponding feature vectors.

To improve the accuracy of image retrieval systems, it is important to have a proper image feature set that describes the contents of an image. A more suitable image features set results in higher retrieval accuracy. The main contributions of this work are summarized as follows:

• We propose to employ the Hybrid Taguchi-Genetic Algorithm (HTGA) to perform feature selection for RBIR systems.
• Instead of using the direct retrieval accuracy, which is expensive to compute, to select better offspring in every generation of the HTGA, we propose to use the Hubert's Γ statistic, which estimates cluster validity, as the fitness measure.
• We propose to use the Halton quasi-random sampling method, which greatly reduces the computation time of the Hubert's Γ statistic.

Our experiment results support that the proposed improvements over the original GA can perform feature set selection efficiently from a large image features set.

⁎ Corresponding author.
E-mail addresses: ywu@csie.ntust.edu.tw (Y.-L. Wu), cytang@cc.hfu.edu.tw (C.-Y. Tang), hor@cs.nccu.edu.tw (M.-K. Hor).

0957-4174/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2010.08.062
2728 Y.-L. Wu et al. / Expert Systems with Applications 38 (2011) 2727–2732
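As noted in the introduction, the similarity of two image regions reduces to a single Euclidean-distance computation over their feature vectors. A minimal sketch (the function name and toy vectors are our own illustration, not the paper's code):

```python
import math

def region_distance(f1, f2):
    """Euclidean distance between two region feature vectors.

    A smaller distance means more similar regions. The vectors must
    share the same dimensionality (78 features in this work).
    """
    if len(f1) != len(f2):
        raise ValueError("feature vectors must have equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

# Toy example with 3-dimensional vectors instead of the full 78:
d = region_distance([0.1, 0.5, 0.2], [0.4, 0.5, 0.6])
```

In practice the 78-dimensional vectors would be populated from the regional, color, and textural features listed above.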
2. The Hybrid Taguchi-Genetic Algorithm (HTGA)

The Taguchi method (Tsai, Liu, & Chou, 2004) is commonly applied in quality management to improve the quality and the stability of production. The Taguchi method can reduce environmental influences on production. From the manufacturing-cost point of view, the Taguchi method allows the use of low-grade materials and inexpensive equipment while maintaining a certain quality level. From the development-cost point of view, the Taguchi method aims to shorten the construction period and to reduce the required resources. The Taguchi method is a robust design approach characterized by high quality, low development effort, and low cost. The two major tools employed by the Taguchi method are the orthogonal array (OA) and the signal-to-noise ratio (SNR). We briefly discuss each in turn.

2.1. The orthogonal array (OA)

In factor design, as the number of factors increases, the number of experiments required increases. The Taguchi method utilizes the OA to collect the experimental data directly, and the result is a more robust factor estimator with fewer experiments required. The OA is an important tool for conducting a robust experiment design. A general orthogonal array is denoted as

L_a(b^c),

where a is the number of experimental runs, b is the number of levels for each factor, and c is the number of columns in the orthogonal array.

The quality of the OA greatly influences the accuracy and the objectivity of the experiments. To construct a qualified OA we use the following general principles:

1. All factors are assumed to be independent: during numerical calculation, we do not take the interrelation among factors into account. If there are dependent factors, we create a polynomial term to represent them; e.g., if factor A and factor B are dependent, we create a new factor A * B separately as an independent factor.
2. The number of appearances of each level must be equal: to maintain the objectivity of the experiments, the occurrences of the levels must be equal; e.g., in a level-2 orthogonal array, if factor 1 has four 0's, then it must also have four 1's to preserve objectivity.
3. The stronger the orthogonal array, the more reliable the experiment results; however, stronger orthogonal arrays are harder to construct and require more experiments. The strength of an orthogonal array is defined as follows: an OA of level 2 (only 1 and 2 appear) and strength 3 has the characteristic that, in any three selected columns, each of the eight combinations (111, 112, 121, 122, 211, 212, 221, 222) must appear. A sample OA of level 2 and strength 3 is shown in Table 1.

Table 1
The L8(2^4) orthogonal array.

Run   A  B  C  D
1     1  1  1  1
2     1  1  2  2
3     1  2  1  2
4     1  2  2  1
5     2  1  1  2
6     2  1  2  1
7     2  2  1  1
8     2  2  2  2

2.2. The signal-to-noise ratio (SNR)

The Taguchi method employs the SNR to estimate the degree to which each factor, at each level, contributes to the objective function. The formulation of the SNR is derived from unbiasedness in statistics: it is an estimate of how far the samples deviate from the center of the population.
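The level-balance and strength properties described in Section 2.1 can be checked programmatically. The sketch below is our own illustration, not part of the paper; the array hard-coded as `OA` is the L8(2^4) array of Table 1:

```python
from itertools import combinations

# The L8(2^4) orthogonal array from Table 1 (8 runs, 4 two-level factors).
OA = [
    (1, 1, 1, 1),
    (1, 1, 2, 2),
    (1, 2, 1, 2),
    (1, 2, 2, 1),
    (2, 1, 1, 2),
    (2, 1, 2, 1),
    (2, 2, 1, 1),
    (2, 2, 2, 2),
]

def strength(oa, levels=(1, 2)):
    """Largest t such that every choice of t columns contains every
    level combination equally often (the 'strength' defined in the text)."""
    n_cols = len(oa[0])
    best = 0
    for t in range(1, n_cols + 1):
        ok = True
        for cols in combinations(range(n_cols), t):
            counts = {}
            for row in oa:
                key = tuple(row[c] for c in cols)
                counts[key] = counts.get(key, 0) + 1
            expected = len(oa) // len(levels) ** t
            if (expected == 0 or len(counts) != len(levels) ** t
                    or any(v != expected for v in counts.values())):
                ok = False
                break
        if ok:
            best = t
    return best
```

For this array, `strength(OA)` returns 3 (every triple of columns contains all eight combinations), matching the strength claimed in the text; each single column also contains four 1's and four 2's, satisfying the balance principle.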
The general formulation of the SNR is as follows:

SNR = −10 log[(ȳ − m)² + S²] = −10 log[(1/n) Σᵢ₌₁ⁿ (yᵢ − m)²],

where ȳ is the sample mean, m is the target mean of the objective, S is the sample standard deviation, and n is the number of samples in the population.

In 2004, Tsai et al. proposed the Hybrid Taguchi-Genetic Algorithm (HTGA) (Tsai et al., 2004), which combines the Taguchi method and the GA and results in faster convergence. The main difference between the HTGA and the original GA is that the offspring produced by the crossover operation must pass an additional Taguchi-method test, which yields the best offspring in each generation. A diagram of the HTGA is shown in Fig. 1. Through this optimization process, the GA converges earlier and with improved precision.

[Fig. 1. A diagram of the HTGA: selection, crossover, Taguchi method, mutation, and replacement of chromosomes (bit strings) in each generation.]

The HTGA is detailed as follows:

1. Initialization (parameter setting): the population size is M chromosomes, the crossover rate is P_C, the mutation rate is P_M, and the number of generations is N.
2. Fitness: calculate the objective value of each individual and the fitness value of each population.
3. Selection: use the roulette-wheel approach, or another similar method, to select the individuals with higher fitness to perform crossover.
4. Crossover: determined by the probability P_C, select the set of individuals that should cross over. From the set, select two individuals at random, then apply the one-cut-point method to generate two offspring.
5. Taguchi test: with a 2-level orthogonal array appropriate for our experiment, take the offspring of step 4 and calculate their fitness and SNR. Then calculate the effective degree of each factor on the objective function to generate the best offspring.
6. Repeat steps 3–5 until the number of better offspring reaches (1/2) · M · P_C.
7. Mutation: the probability of mutation is determined by the mutation rate P_M.
8. Replacement: sort the parents and offspring by their fitness measures, then select the best M chromosomes as the parents of the next generation.
9. Repeat steps 2–8 until one of the following two stopping conditions is met: the HTGA converges to the optimal solution, or the number of executed generations exceeds the pre-defined threshold.

3. Cluster validity

Cluster validity measures the adequacy of a structure recovered by cluster analysis so that it can be interpreted objectively. The adequacy of a clustering structure refers to the degree to which the structure reflects the intrinsic character of the data (Bel Mufti, Bertrand, & El Moubarki, 2005; Dubes, 1993; Halkidi, Batistakis, & Vazirgiannis, 2002; Jain & Dubes, 1988; Liu, Jiang, & Kot, 2009; Santos, Marques de Sá, & Alexandre, 2008). In general, there are three criteria for investigating cluster validity, namely external, internal, and relative. Hypothesis tests are used to determine whether a recovered structure is appropriate for the data. When the external and internal criteria are used, the hypothesis tests examine whether the value of the index is either very large or very small. Many statistical tools can be employed for cluster validity, e.g., Monte Carlo methods, Hubert's Γ, and the Goodman–Kruskal γ (Jain & Dubes, 1988). The steps for testing the validity of a clustering structure are as follows:

Step 1: Select the clustering structure, the validation criterion, and the index.
Step 2: Obtain the distribution of the index under the no-structure hypothesis.
Step 3: Compute the index for the clustering structure.
Step 4: Statistically test the no-structure hypothesis by determining whether the index from Step 3 is unusually large or unusually small.

3.1. The Hubert's Γ statistic

To validate a computed clustering structure, one can compare it to an a priori structure. The Hubert's Γ statistic was designed to measure the fit between the data and an a priori structure. Let X = [X(i, j)] and Y = [Y(i, j)] be two n × n proximity matrices on n objects. X(i, j) is the observed proximity between objects i and j, and Y(i, j) is defined as

Y(i, j) = 0 if objects i and j belong to the same category, 1 if not.

The Hubert's Γ statistic is the point-by-point correlation between the two matrices X and Y. When X and Y are symmetric, we have

Γ = Σᵢ₌₁ⁿ⁻¹ Σⱼ₌ᵢ₊₁ⁿ X(i, j) Y(i, j).

However, the Γ computed from the above equation is in raw form. The normalized Γ statistic is

Γ = (1/M) Σᵢ₌₁ⁿ⁻¹ Σⱼ₌ᵢ₊₁ⁿ [X(i, j) − mₓ][Y(i, j) − m_y] / (Sₓ S_y),

where M = n(n − 1)/2, mₓ and m_y denote the sample means of the entries in X and Y, and Sₓ and S_y denote the sample standard deviations of the entries in X and Y. The normalized Γ statistic ranges between −1 and 1. If the two matrices agree with each other in structure, then the absolute value of the Γ statistic will be unusually large. One of the most common applications of the Γ statistic is to test the random-label hypothesis, i.e., could the values in one of the two matrices X and Y have been inserted at random? To test the random-label hypothesis, the distribution of Γ under this hypothesis must be known in advance. This distribution is the accumulated histogram of Γ over all n! permutations of the row and column numbers of Y.

When testing whether the value of Γ is unusually large, the distribution must be found by evaluating Γ for all n! permutations in advance. Even with six objects, 6! = 720 values of Γ must be computed, and with nine objects, 9! = 362,880 values must be found, which leads to a computationally expensive procedure. We propose to employ the Halton quasi-random numbers technique (Press, Teukolsky, Vetterling, & Flannery, 2002) as a solution to this computational problem. The samples generated by the Halton quasi-random numbers technique distribute uniformly in n-dimensional space.

We employ the Halton quasi-random numbers technique to reduce computation by generating sample distributions for the Hubert's Γ statistic. The distributions of the Halton quasi-random numbers in a two-dimensional space are shown in Fig. 2.
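Both the raw and the normalized forms of the Hubert's Γ statistic can be computed directly from the upper triangles of the two proximity matrices. A minimal sketch (our own illustration; `hubert_gamma` is a hypothetical helper name, and the sample moments are taken over the M = n(n − 1)/2 upper-triangle entries, as in the equations above):

```python
import math

def hubert_gamma(X, Y, normalized=True):
    """Hubert's Gamma between two symmetric n-by-n proximity matrices.

    Raw form:        sum over i < j of X[i][j] * Y[i][j].
    Normalized form: point-by-point correlation of the upper-triangle
                     entries, lying in [-1, 1]. Assumes the entries of
                     X and Y are not all constant (nonzero Sx, Sy).
    """
    n = len(X)
    pairs = [(X[i][j], Y[i][j]) for i in range(n - 1) for j in range(i + 1, n)]
    if not normalized:
        return sum(x * y for x, y in pairs)
    M = len(pairs)                               # M = n(n-1)/2
    mx = sum(x for x, _ in pairs) / M
    my = sum(y for _, y in pairs) / M
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs) / M)
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs) / M)
    cov = sum((x - mx) * (y - my) for x, y in pairs) / M
    return cov / (sx * sy)
```

Comparing a matrix with itself yields a normalized Γ of 1, the expected upper bound of the statistic.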
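The paper does not spell out how the Halton samples are mapped to permutations, so the sketch below is one plausible realization (our own construction, not the authors' code): a radical-inverse generator for the Halton sequence, plus a Lehmer-code unranking that converts each draw in [0, 1) into a permutation of the row and column labels of Y, so the distribution of Γ can be estimated from a quasi-uniform subsample instead of all n! permutations.

```python
def halton(index, base):
    """Radical-inverse (van der Corput) value of `index` in `base`.

    Successive indices fill [0, 1) quasi-uniformly; a d-dimensional
    Halton sequence uses one such stream per dimension with distinct
    prime bases.
    """
    f, r = 1.0, 0.0
    while index > 0:
        f /= base
        r += f * (index % base)
        index //= base
    return r

def unrank_permutation(u, n):
    """Map u in [0, 1) to one of the n! permutations of range(n)
    via the factorial number system (Lehmer code)."""
    fact = 1
    for k in range(2, n + 1):
        fact *= k                      # fact = n!
    rank = min(int(u * fact), fact - 1)
    items = list(range(n))
    perm = []
    for k in range(n, 0, -1):
        fact //= k                     # (k-1)! at each step
        i, rank = divmod(rank, fact)
        perm.append(items.pop(i))
    return perm

# A few quasi-uniformly spread permutations of 4 objects:
perms = [unrank_permutation(halton(i, 2), 4) for i in range(1, 6)]
```

Each sampled permutation would be applied to the rows and columns of Y, Γ recomputed, and the resulting values accumulated into an approximate null histogram.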
5. Experiments

[Table 2. Initial image features set.]
[Table 3. Feature selection results.]
[Table 4. Feature selection results (sorted).]
[Table 5. Details of the 15 feature selection experiments.]
[Table 6. Comparison of indexing accuracy.]

The experiment results also suggest that the proposed method can select a smaller image features set and produce higher retrieval accuracy.

Acknowledgements