
Pattern Recognition 60 (2016) 106–120


Learning feature fusion strategies for various image types to detect salient objects

Muhammad Iqbal*, Syed S. Naqvi, Will N. Browne, Christopher Hollitt, Mengjie Zhang
School of Engineering and Computer Science, Victoria University of Wellington, PO Box 600, Wellington 6140, New Zealand

* Corresponding author. E-mail addresses: muhammad.iqbal@ecs.vuw.ac.nz (M. Iqbal), syed.saud.naqvi@ecs.vuw.ac.nz (S.S. Naqvi), will.browne@ecs.vuw.ac.nz (W.N. Browne), christopher.hollitt@ecs.vuw.ac.nz (C. Hollitt), mengjie.zhang@ecs.vuw.ac.nz (M. Zhang).
http://dx.doi.org/10.1016/j.patcog.2016.05.020

Article history: Received 7 March 2016; Received in revised form 20 April 2016; Accepted 4 May 2016; Available online 21 May 2016

Keywords: Object Detection; Saliency Map; Learning Classifier Systems; XCS; Pattern Recognition

Abstract: Salient object detection is the task of automatically localizing objects of interest in a scene by suppressing the background information, which facilitates various machine vision applications such as object segmentation, recognition and tracking. Combining features from different feature modalities has been demonstrated to enhance the performance of saliency prediction algorithms, and different feature combinations are often suited to different types of images. However, existing saliency learning techniques attempt to apply a single feature combination across all image types and thus lose generalization in the test phase when considering unseen images. Learning classifier systems (LCSs) are an evolutionary machine learning technique that evolves a set of rules, based on niched genetic reproduction, which collectively solve the problem. It is hypothesized that the LCS technique has the ability to autonomously learn different feature combinations for different image types. Hence, this paper further investigates the application of LCS for learning image-dependent feature fusion strategies for the task of salient object detection. The obtained results show that the proposed method outperforms, through evolving generalized rules to compute saliency maps, the individual feature based methods and seven combinatorial techniques in detecting salient objects from three well known benchmark datasets of various types and difficulty levels.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

Visual saliency has recently attracted much computer vision research, giving birth to a new sub-domain known as salient object detection [1]. For salient object detection, the task is to detect the salient, attention-grabbing object(s) in a scene and subsequently segment it in its entirety [2,3]. It is similar to the problem of figure-ground segmentation [4-6], but differs from the traditional segmentation problem as the task is simply to find the most salient object rather than completely partitioning the image into perceptually homogeneous regions [7]. Salient object detection is essentially the task of marking regions of interest in a scene, which facilitates various computer vision applications, e.g. image segmentation [8], image retrieval [9,10], picture collage [11,12], object recognition [13] or image compression [14].

Most methods specialized for the task of salient object detection concentrate on constructing deterministic tailor-made features [15,16], such as color or color gradient, and apply heuristics to combine them. A class of models [17-19] uses low, mid and high-level features to learn a single set of weighting parameters for combining features, but applies them across multiple types of images, e.g. images with cluttered backgrounds or multiple objects of interest. Therefore, such techniques inherently lose generalization when operated on test sets with different images having various properties and sets of features. An alternative approach is to learn model parameters using an assembly of weak learners, which increases generalization. However, the quality of the final solution depends upon the performance of the individual learners and can be degraded if one of the learners is not optimal [20].

A learning classifier system (LCS) is a rule-based machine learning technique in which each rule relates sections of the feature space with a classification and a measure of accuracy [21,22]. To address the issue of loss in generalization on unseen image types and to make the system general for all image types, previously we utilized the strength of LCS to autonomously divide the feature space into niches and construct rules covering each image type [23]. The aim of this paper is to extend and demonstrate the LCS technique proposed in [23] by fully investigating the niching of image types and demonstrating performance on a wide range of domains and salient object detection benchmark techniques.

The rest of the paper is organized as follows. Section 2 briefly describes the related work in salient object detection. In Section 3 the proposed LCS technique to detect salient objects in an image is detailed. Section 4 introduces the datasets, parameter settings, and performance measures used in the experimentation. In Section 5 experimental results are presented and compared with existing state-of-the-art systems. Section 6 provides an analysis of the evolved classifier rules obtained using the proposed LCS system. In the final section this work is concluded and future work is outlined.

2. Background

This section introduces the necessary background in learning classifier systems, and the related work in salient object detection.

2.1. Learning classifier systems

Traditionally, an LCS represents a rule-based agent that incorporates a genetic algorithm (GA) and machine learning to solve a given task by evolving a population of interpretable classifiers. Each classifier covers a part of the feature space that may overlap with other classifiers. The LCS technique has been successfully applied to a wide range of problems including classification, data mining, control, modeling, image processing and optimization problems [24-31].

The proposed LCS method, presented in this study to compute saliency maps, is based on XCS [32], which is a well-tested LCS model. In XCS, the learning agent evolves a population [P] of classifiers, as depicted in Fig. 1, where each classifier consists of a rule and a set of associated parameters estimating the quality of the rule. Each rule is of the form 'if condition then action', where the condition is used to match input observations, and the corresponding action predicts the class label for a given observation. Commonly, the condition in a rule is represented by a conjunction of predicates, using one predicate for each corresponding input feature, and the action is represented by a numeric constant.

In XCS, on receiving the environmental input state s, a match set [M] is formed consisting of the classifiers from the population [P] that have conditions matching the input s. For every action ai in the set of all possible actions, if ai is not represented in [M] then a covering classifier is randomly generated. After that, an action a is selected to be performed on the environment and an action set [A] is formed, which consists of the classifiers in [M] that advocate a. After receiving an environmental reward, the associated parameters of all classifiers in [A] are updated. When appropriate, new classifiers are produced using an evolutionary mechanism, usually a GA. Additionally, in XCS overly specific classifiers may be subsumed by more general and accurate classifiers in order to reduce the number of classifiers in the final population [33]. For a complete description, the interested reader is referred to the original XCS papers by Wilson [32,34], and to the algorithmic details by Butz and Wilson [35].
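To make the XCS cycle above concrete, the following minimal Python sketch shows one explore step with matching, covering and action-set updates. It is our illustration rather than the authors' implementation; the classifier structure, interval-based conditions and reward function are assumptions made for exposition only.

import random
from dataclasses import dataclass

@dataclass
class Classifier:
    condition: list          # list of (low, high) intervals over the encoded inputs
    action: int
    prediction: float = 10.0
    error: float = 0.0
    fitness: float = 0.01

def matches(cl, state):
    return all(lo <= s <= hi for (lo, hi), s in zip(cl.condition, state))

def xcs_explore_step(population, state, actions, reward_fn, beta=0.2):
    match_set = [cl for cl in population if matches(cl, state)]
    for a in actions:                              # covering: every action represented in [M]
        if not any(cl.action == a for cl in match_set):
            cover = Classifier([(s - 0.2, s + 0.2) for s in state], a)
            population.append(cover)
            match_set.append(cover)
    action = random.choice(actions)                # explore; exploiting would pick the best prediction
    action_set = [cl for cl in match_set if cl.action == action]
    reward = reward_fn(state, action)
    for cl in action_set:                          # Widrow-Hoff style parameter updates
        cl.prediction += beta * (reward - cl.prediction)
        cl.error += beta * (abs(reward - cl.prediction) - cl.error)
    # a niched GA and subsumption would act on the action set here when due
    return reward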

Fig. 1. Overview of a learning classifier system [28].

2.2. Salient object detection

Visual attention is a fundamental research problem in the psychology, neuroscience, and computer vision literature. Researchers have built computational models of visual attention to predict where humans are likely to fixate [36]. Recently, this work has been expanded to identify salient objects in a scene for object detection and localization. Salient object detection is a difficult problem in computer vision as natural scenes can include objects with cluttered backgrounds (making it difficult to distinguish the object from the background based on its features) and scenes containing multiple objects.

Deterministic methods to detect salient objects rely on fine human-constructed features, but they usually combine them linearly, thus neglecting the importance of individual features [16]. Machine learning approaches have the ability to learn feature importance during combination, which enhances their performance in challenging cases such as scenes with cluttered backgrounds and multiple objects [37].

Tong et al. [38] used 73 texture and color features, exploring both global and local cues, to compute a saliency map. However, the simplistic nature of the feature combination (i.e., the average of the local and global features) compromises the final saliency output on difficult cases of saliency detection. Judd et al. [17] learned a model of saliency from 33 features (including low, mid and high level features) to predict human eye fixations. They used support vector machines (SVMs) with linear kernels to learn feature weightings, while Zhao and Koch [19] used least squares regression to learn weights for eye fixation prediction using basic saliency features (i.e., color, intensity and orientation). Both discriminative approaches lose generalization on a subset of images due to a single weighting scheme being applied to features for all image types. Singh et al. [39] applied a constrained Particle Swarm Optimization method to determine an optimal weight vector to combine different features to obtain a saliency map. The single solution set learned after the evolutionary process hinders the generalization performance of their method on the large set of testing images.

The recent discriminative regional feature integration (DRFI) work of Jiang et al. [7] employs regression to automatically select and integrate features. The inclusion of high dimensional features for learning a regression model enables their approach to discover discriminative features and achieve robust performance on unseen data. However, computing high dimensional features and running a regressor for each image to increase generalization comes at a cost of additional computational time and limits scalability with an increasing number of regions per image. Despite the inclusion of multiple segmentations at different scales, the region based saliency computation is prone to inherent limitations, such as inappropriate annotation and non-uniform saliency assignment. Additionally, in scenarios where feature performance is heavily dependent upon important feature related parameters, joint optimization of such parameters cannot be easily incorporated into the automatic feature integration process.

Moreover, related work on alternating optimization for multiview varied feature fusion has been proposed that takes into account the complementary information of multiview data [40,41].

All the above mentioned approaches achieve reasonably good results; however, they lose generalization capability as they only learn a single set of weights to combine features for all image types. AdaBoost [18,20] learns the task of salient object detection using an assembly of weak learners (hence increasing generalization). However, the quality of the final solution depends upon the performance of the individual learners and can be affected drastically by one of the learners in the decision tree, badly affecting the generalization of the overall system [42]. Once again, AdaBoost does not divide the feature space into niches depending upon image types, which affects generalization on unseen images. Due to the cooperative nature of the evolved rules, the LCS technique has an inherent capability to autonomously divide the feature space into niches. Previously we adapted a supervised classifier system, known as XCS with Computed Action (XCSCA) [43], to learn different combinations of image features and construct rules covering each image type [23]. This study substantiates that work by fully investigating the niching of different image types and demonstrating performance on a wide range of domains and salient object detection benchmark techniques.

3. Salient object detection using learning classifier systems

Commonly, salient objects in an image are detected by determining a saliency map for the input image, which is a computed image that attempts to emphasize the object(s) to be detected. In a supervised learning system, it is necessary to provide the target saliency map (i.e., the ground truth) along with the input image during training. A ground truth is a manually segmented binary image that emphasizes the object to be detected. During the testing process, only the image features computed from the input image are provided to the system. The training and testing processes used in the proposed XCSCA-based approach to detecting salient objects are briefly described in Algorithms 1 and 2, respectively.

Algorithm 1. The training process in the proposed approach. Here maxProbs denotes the maximum number of training instances, which is usually greater than the number of images in the training set S because a single image can be used more than once as a training instance.
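The published algorithm listing does not survive in this text, so the following hedged Python outline reconstructs the training loop from the description above. The callables (feature extraction, matching/covering, the Eq. (1) action computation, the Eq. (2) error, the RLS/fitness updates and the niched GA) are assumptions supplied by the surrounding system, not functions defined in the paper.

import random

def train(images, ground_truths, population, max_probs,
          compute_features, form_match_set, compute_saliency_map,
          error_function, update_classifier, run_ga_if_due):
    """Hedged sketch of the training process (not the paper's Algorithm 1 verbatim)."""
    for _ in range(max_probs):                     # maxProbs >= number of training images
        idx = random.randrange(len(images))        # an image may be used more than once
        features = compute_features(images[idx])   # the nine feature maps f0..f8
        match_set = form_match_set(population, features)   # covering if nothing matches
        for cl in match_set:
            saliency = compute_saliency_map(cl, features)   # Eq. (1)
            err = error_function(saliency, ground_truths[idx])  # Eq. (2)
            update_classifier(cl, err, features)   # RLS weight and fitness updates
        run_ga_if_due(population, match_set)       # niched genetic reproduction
    return population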

Algorithm 2. The testing process in the proposed approach.
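A correspondingly hedged sketch of the testing step is given below. Only the image features are supplied; how the maps of several matching classifiers are merged is not fixed by the surviving text, so selecting the fittest matching rule is an assumption made purely for illustration.

def detect_salient_object(image, population, compute_features,
                          form_match_set, compute_saliency_map):
    """Hedged sketch of the testing process (not the paper's Algorithm 2 verbatim)."""
    features = compute_features(image)
    match_set = form_match_set(population, features)   # nearest-classifier fallback at test time
    best = max(match_set, key=lambda cl: cl.fitness)   # assumed rule: use the fittest matching classifier
    return compute_saliency_map(best, features)        # Eq. (1) with the rule's evolved weights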
The remainder of this section explains the new methods incorporated in XCSCA to compute saliency maps in an effort to detect salient objects from various types of images by learning different feature fusion strategies. The new extensions introduced in the proposed technique are: the design of input features, the mechanism to match an input image with a population of classifiers, the mechanism to compute actions, which are essentially the saliency maps, and the error function to calculate an error between a computed saliency map and the target saliency map.

3.1. Design of input features

Instead of matching the input image at pixel level against conditions of classifier rules, we compute the following nine saliency-based features for each input image. Each of these produces a two-dimensional real-valued array. We carefully select potential features from previous work [44,45,8,46,47] that are suited for the task of salient object detection and have previously been shown to correlate with visual attention in different experimental settings. To thoroughly evaluate the learning performance of models, we have chosen features that complement each other well, with each one performing better than the others for a particular image type in past experience. (An illustrative sketch of one such feature is given after the list below.)

• f0: A global feature that assigns low saliency to colors that vary a lot in the spatial domain, based on the work of Liu et al. [1]. To compute the spatial variance of colors in an image, all the colors are modeled by a Gaussian Mixture Model (GMM). Afterwards, each pixel in the spatial domain is assigned to a color component of the GMM. Next, the horizontal and vertical variances of each color component of the GMM are calculated and added to obtain the total variation of colors in the spatial domain. Finally, the colors having high total variance are assigned low saliency, while the colors exhibiting low variance in the spatial domain are assigned high saliency values.
• f1: A global feature that captures the contrast between clusters obtained through k-means segmentation, inspired by the work of Fu et al. [45]. The contrast cue of a cluster is computed by accumulating its distance to all other clusters.
• f2: A global feature computing the spatial distribution of pixels in a cluster with respect to the image center, inspired by the work of Fu et al. [45]. The global spatial distribution of pixels in a cluster from the image center is measured as the mean Euclidean distance between the pixels in a cluster and the image center.
• f3: A region based feature that computes the global contrast between spatially neighboring regions only [8]. The image is first segmented using graph-based image segmentation. Next, a quantized color histogram is constructed for each region. Afterwards, saliency for each region is computed as its weighted color contrast to all other regions in the image. The weights are the number of pixels contained by a region, to emphasize contrast to larger regions, while the contrast itself is measured by the color distance metric between the two regions.
• f4, f5: Two low-level region-based color features adapted from the work of Naqvi et al. [47]. One color feature for each region is computed by accumulating the earth mover's distance (EMD) of its LAB histogram from the histograms of all other regions in the image, while the other is computed by measuring EMD between the histograms of a region and its neighboring regions only. Additionally, contrast from the boundary regions is also exploited in these features to enhance their discriminative power.

• f6: A mid-level feature that uses the objectness of image windows to highlight salient objects, based on the work of Alexe et al. [46]. The objectness measure for a window is based on four image cues. The first cue is multi-scale image saliency based on the work of Hou and Zhang [48]; the second cue is the color contrast of a window from its surrounding regions, computed as the Chi-square distance between the Lab histograms of the window and its surrounding super-pixels; the third cue is computed by measuring the edge density inside a window; finally, the last cue counts the number of super-pixels that have their pixels both inside and outside the boundary of the window. The number of such super-pixels must be low for a window having a high probability of containing the object(s).
• f7: A feature that groups regions based on their objectness score. Similar regions in terms of objectness scores are merged to form a larger region. For each region, its difference from all other regions is computed in terms of objectness scores to form a difference matrix of size equal to the square of the number of regions. From the difference matrix a global threshold is calculated by finding the smallest difference that exists between neighbors. Afterwards, a local process compares regions only with their neighboring regions and groups those having a difference less than the global threshold by assigning them the same objectness.
• f8: A feature that highlights salient patterns, based on the work of Naqvi et al. [47]. The salient patterns are determined by finding any outstanding patches that have a large distance from neighboring patches. Match distance is employed to compute the histogram distance between patches due to its ability to capture cross-bin similarities/dissimilarities. In addition, intra-patch variance is exploited for computational efficiency.
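As an illustration of one of these features, the sketch below approximates the f0-style spatial color variance cue using scikit-learn. It is a minimal sketch under stated assumptions (hard GMM assignments, a fixed number of components, linear rescaling of the component scores), not the feature implementation used in the paper.

import numpy as np
from sklearn.mixture import GaussianMixture

def spatial_variance_saliency(image, n_components=5):
    """image: (H, W, 3) float array in [0, 1]. Returns an (H, W) saliency map where
    colors with large spatial variance receive low saliency, as described for f0."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3)
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          random_state=0).fit(pixels)
    labels = gmm.predict(pixels)                      # assign each pixel to a color component
    ys, xs = np.mgrid[0:h, 0:w]
    ys, xs = ys.ravel(), xs.ravel()
    score = np.zeros(n_components)
    for c in range(n_components):
        mask = labels == c
        if mask.any():
            total_var = xs[mask].var() + ys[mask].var()   # horizontal plus vertical variance
            score[c] = -total_var                         # high variance -> low saliency
    score -= score.min()
    if score.max() > 0:
        score /= score.max()                              # rescale component scores to [0, 1]
    return score[labels].reshape(h, w)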

3.2. Input matching scheme

If classifier conditions in an LCS are matched directly to the computed image features, then it will be hard to evolve any generalized classifier rules due to the large-sized two-dimensional image features [26]. Therefore, in order to enable generalization, we introduce a novel encoding scheme to match an input image against classifier conditions.

Fig. 2. The novel encoding scheme to match an input image against the classifier population in order to enable generalization in classifier rules.

Fig. 3. From left to right: image, ground truth, features f0-f8 in the same order as listed in Section 3.

Table 1
The learning parameters used in the proposed method.

Parameter Description Value

FI Initial fitness value of a new classifier 0.01
α Fitness fall-off rate used to calculate fitness of a classifier 0.1
β Learning rate used to update fitness of a classifier 0.2
ϵ0 Error threshold in accuracy under which a classifier is considered to be accurate 10
ν Fitness exponent used to calculate fitness of a classifier 5
θGA Threshold above which the GA is applied in a match set 25
χ Probability of applying (two-point) crossover in the GA to produce two offspring 0.8
μ Probability of mutating an attribute in an offspring 0.01
θdel Experience threshold above which fitness of a classifier may be considered in its probability of deletion 20
δ Fraction of mean fitness of population below which fitness of a classifier may be considered in its probability of deletion 0.1
θsub Experience threshold above which a classifier may subsume another classifier 20
Fr Fitness reduction factor used to reduce the average fitness of two parent classifiers before assigning it to offspring 0.1
r0 A parameter used in the covering operation to determine the distribution range of the spread in each interval 0.7
m0 A parameter used in the mutation operation to modify the distribution range of intervals in offspring 0.5
x0 A constant input parameter used to compute action of a classifier 0.5
δrls A scaling factor used to initialize the covariance matrix in a classifier 100
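For reference, the same settings can be collected in a small configuration object. The following Python dictionary simply transcribes Table 1 together with the population and training-instance counts given in Section 4.2; it is an illustrative convenience, not code released with the paper.

XCSCA_PARAMS = {
    "population_size": 2000,      # number of classifiers (Section 4.2)
    "training_instances": 20000,  # maxProbs (Section 4.2)
    "F_I": 0.01,      # initial fitness of a new classifier
    "alpha": 0.1,     # fitness fall-off rate
    "beta": 0.2,      # learning rate
    "epsilon_0": 10,  # error threshold for an accurate classifier
    "nu": 5,          # fitness exponent
    "theta_GA": 25,   # GA activation threshold in a match set
    "chi": 0.8,       # two-point crossover probability
    "mu": 0.01,       # per-attribute mutation probability
    "theta_del": 20,  # deletion experience threshold
    "delta": 0.1,     # deletion fitness fraction
    "theta_sub": 20,  # subsumption experience threshold
    "F_r": 0.1,       # fitness reduction factor for offspring
    "r_0": 0.7,       # covering spread
    "m_0": 0.5,       # mutation spread
    "x_0": 0.5,       # constant input for computed actions
    "delta_rls": 100, # RLS covariance initialization factor
}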

Fig. 4. Comparison of individual features with XCSCA on all datasets in terms of PR curves. From top row to bottom row: MSRA, SOD, SED2 and average PR curve on all
datasets.

Fig. 5. Comparison of individual features with the proposed XCSCA method in terms of segmentation based measures, i.e. F-measure, precision and recall. (a) MSRA, (b) SOD,
(c) SED2 and (d) average measures on all datasets.

In this encoding scheme, each computed image feature fi is encoded as a real-valued constant di to be matched against classifier conditions. The real-valued constant for a feature is computed using the earth mover's distance (EMD) [49] from a two-dimensional artificial feature consisting of all ones. EMD is a way of converting an image into a number; it is defined as the minimum cost of transforming one histogram into the other given a ground distance between the features.¹ Consequently, conditions in classifier rules are encoded as a concatenation of real-valued intervals so that the system can evolve generalized classifiers. To aid in understanding the novel encoding scheme, the classifier matching mechanism is depicted in Fig. 2.

If the current input state is not matched by any classifier in the population during the training phase, a random classifier is created to cover the current input. However, if an input is not matched during the testing phase, we compute the Euclidean distance of each classifier condition from the current input and consider the closest |P| × tournamentSize classifiers to be matched, where |P| is the number of classifiers in the population and tournamentSize is a constant parameter, usually chosen as 0.4.

¹ We used the fast implementation of EMD with thresholded ground distances based on the work of Pele and Werman [50].
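A minimal sketch of this matching scheme is given below (illustrative Python; the nine encoded constants d0..d8 are assumed to be available, population entries are assumed to expose their interval condition under a "condition" key, the covering spread r0 follows Table 1, and measuring the distance to a condition via its interval centres is our assumption, since the text does not specify it).

import math
import random

def matches(condition, d):
    """condition: list of nine (low, high) intervals; d: nine encoded constants d0..d8."""
    return all(lo <= di <= hi for (lo, hi), di in zip(condition, d))

def cover(d, r0=0.7):
    """Training-phase covering: a random condition around the unmatched input."""
    return [(di - random.uniform(0, r0), di + random.uniform(0, r0)) for di in d]

def nearest_classifiers(population, d, tournament_size=0.4):
    """Testing-phase fallback: keep the closest |P| * tournamentSize classifiers,
    ranked by Euclidean distance of their interval centres from the input."""
    def distance(condition):
        centres = [(lo + hi) / 2.0 for lo, hi in condition]
        return math.sqrt(sum((c - di) ** 2 for c, di in zip(centres, d)))
    ranked = sorted(population, key=lambda cl: distance(cl["condition"]))
    return ranked[: max(1, int(len(population) * tournament_size))]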
3.3. Computing actions/saliency maps

In an LCS the action in a classifier rule is usually represented by a fixed scalar value. However, to learn a task that has a large number of classes, it is beneficial to use a mechanism to compute actions instead of using fixed scalar actions in classifier rules [43,28].

In this work, the action (the saliency map) in a classifier rule is computed as a linear function of the input image features and a weight vector w, similar to XCSCA [43]. In this function, the computed two-dimensional features are linearly combined with the evolved weights to produce the required saliency map for the matched input image, as given in Eq. (1), where x0 denotes a constant input parameter. The weights are evolved using recursive least squares, as described in [51].

SaliencyMap = w_0 x_0 + \sum_{i=0}^{8} w_{i+1} f_i.   (1)

To form a match set we use the real-valued distance (denoted by d) for each of the two-dimensional features, but to compute the saliency map the corresponding feature itself is used.
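As a concrete illustration of Eq. (1), the following numpy sketch combines the nine feature maps with an evolved weight vector. The RLS update that produces w is omitted; the shapes and the value x0 = 0.5 follow the text and Table 1, and the random inputs in the usage example are placeholders only.

import numpy as np

def compute_saliency_map(weights, features, x0=0.5):
    """weights: length-10 array (w0 for the constant input x0, w1..w9 for f0..f8).
    features: list of nine 2-D arrays of identical shape (the maps f0..f8)."""
    saliency = weights[0] * x0 * np.ones_like(features[0])
    for i, f in enumerate(features):
        saliency += weights[i + 1] * f
    return saliency

# usage: nine 200x200 feature maps and an arbitrary weight vector
features = [np.random.rand(200, 200) for _ in range(9)]
w = np.random.rand(10)
saliency_map = compute_saliency_map(w, features)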

Fig. 6. Comparison of methods with respect to PR curves. From top row to bottom row: MSRA, SOD, SED2 and average PR curve on all datasets.

Fig. 7. Comparison of methods with respect to segmentation based measures, i.e. F-measure, Precision and Recall. (a) MSRA, (b) SOD, (c) SED2 and (d) average measures on
all datasets. The methods are sorted according to their F-measure scores.

3.4. Error function

The goal of the proposed approach is to evolve saliency maps for each input image such that the error between the computed saliency map and the target saliency map is minimized. The error is calculated using an error function that determines the difference between the computed saliency map and the ground truth of the input image. The calculated error is used to update the associated parameters in the corresponding classifiers. The error function used in this work is

Error = 1 - \frac{TP + TN}{TP + FP + TN + FN},   (2)

where TP and TN are the numbers of correctly classified positive and negative pixel values (for a single image) between the computed saliency map and the ground truth, and FP and FN represent the falsely classified pixel values.
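A short sketch of Eq. (2) follows (illustrative Python; the threshold used to binarize the computed saliency map before counting pixels is an assumption, since the text does not fix it here).

import numpy as np

def saliency_error(saliency_map, ground_truth, threshold=0.5):
    """Pixel-wise error of Eq. (2). ground_truth is a binary mask; the saliency
    map is binarized at an assumed threshold before counting TP/TN/FP/FN."""
    pred = saliency_map >= threshold
    gt = ground_truth.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    return 1.0 - (tp + tn) / float(tp + tn + fp + fn)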
4. Experiment design

This section describes the data sets, parameter settings, performance measures, and the state-of-the-art methods used in this study to compare the results of the proposed approach, XCSCA.

4.1. Data sets

This work employs the following commonly used benchmark data sets in the field of computer vision: Microsoft Research Asia (MSRA) [44], Salient Object Dataset (SOD) [52], and Segmentation Evaluation Dataset (SED2) [53].

The MSRA data set comprises 25,000 images in total and includes ground truth annotations in the form of labeled rectangles from multiple users. These ground truth annotations classify multiple objects as one by placing them in a single rectangle and also do not cater for pixel-wise accuracy. To remedy these effects, a set of 1000 images was manually segmented by a single user to obtain binary masks [54]. These ground truth masks consider the effect of pixel-wise accuracy and multiple objects and are widely accepted in the field of computer vision as a standard benchmark for saliency evaluation.

Fig. 8. Visual comparison of selected saliency methods on representative images from MSRA (test images), SOD and SED2 datasets.

All images and respective ground truths are resized to 200 × 200 to leverage computational efficiency. Consequently the computed feature maps are of the same size. Fig. 3 shows some representative images with their ground truth and respective features from the three data sets. It can be observed that no single feature accurately captures the ground truth.

The SOD [52] benchmark contains 300 images that are difficult for salient object detection due to cluttered backgrounds and ambiguous salient object scenes. Boundary level ground truth information is available for all images. The SED2 [53] data set contains 100 two-object images along with pixel-wise ground truth and is a subset of the segmentation evaluation database.

4.2. Parameter settings

For all the experiments conducted in this study, the number of classifiers used is 2000 and the number of training instances is 20,000. The selection method used to select two parent classifiers in the GA is tournament selection with a tournament size ratio of 0.4. GA subsumption is activated whereas action set subsumption is deactivated. The remaining learning parameters and their values used in this study are shown in Table 1; these are the commonly used values in the literature, as suggested by Butz and Wilson [35].

The training has been conducted on the MSRA data set only, as it includes images of different types, e.g. cluttered background, multiple objects, large salient objects, small salient objects, faces, persons, objects and text. The resulting classifier populations are tested on each of the three data sets. All the experiments have been repeated 30 times with a different seed in each run, where the MSRA data set has been randomly divided into a training set of 700 and a test set of 300 images in each experiment. Each result reported in this work is the average of the 30 runs.

4.3. Performance measures

The performance is measured using precision–recall (PR) curves and the segmentation quality. The PR curves are drawn by computing 256 pairs of average precision and recall values. To compute the 256 pairs, the saliency maps are thresholded using 256 thresholds in the range 0 to 255 to generate 256 binary maps. The binary maps are compared with the corresponding ground truth map for each image to get 256 pairs of precision and recall values for each image. The average of the precision–recall pairs over the whole dataset gives the final PR curve. The increase of recall values on the x-axis of the PR curves corresponds to a decrease of threshold values from 255 to 0.

We present the original (i.e. empirical) PR curves as well as interpolated PR curves in the results section. The interpolated PR curves are computed by the "maximum precision for each recall" method [55], which ensures a monotonically decreasing trend and has been agreed upon by researchers in the document retrieval field.

In order to compare the segmentation quality of different methods, the saliency maps are thresholded adaptively based on their average intensity values to compute precision, recall and F-measure metrics. The F-measure in this case is a measure of the accuracy of the saliency map in completely segmenting the whole object from an image.

Table 2
A sample of the experienced and accurate classifier rules, obtained in a typical run, using the proposed LCS system. The names of images covered (i.e. matched) by a rule are
listed under the rule.

No. Condition Weight Vector

1 [0.00 1.00] [0.03 0.68] [0.00 1.00] [0.22 1.00] [0.00 0.89] [0.00 0.44] [0.65 1.00] [0.02 0.66] 416.2, 352.9,183.8, 268.0,1111.5, 141.0,701.9,752.8,1135.5,397.1
[0.00 0.43]
0_13_13700, 0_18_18530, 0_24_24558, 0_2_2301, 0_6_6646, 10_268015474_2bd515353c, 10_52202431.dscn0209.png ,10_59242003.img_4193, 1_55_55005,
1_57_57848

2 [0.00 0.47] [0.00 0.80] [0.00 0.74] [0.00 0.82] [0.00 0.43] [0.00 0.27] [0.99 1.00] [0.00 0.77] 400.5, 72.7,980.0,49.6,272.7,141.5,1311.2,294.8, 110.6,628.8
[0.00 1.00]
0_18_18530, 0_19_19593, 0_21_21781, 0_2_2301, 0_7_7821, 10_260372546_03d18a4e9e, 10_267183829_e8538b16ff.png ,10_268122508_c361b4db6c,
10_268252475_3716432836_m, 10_268262571_a713359657, 10_41105485.05_0226_aabbspoth2, 10_43642602.pasqueflowerpasqueflowerearlyeve2,
10_49513258.0509180006.fsdm, 10_52202430.dscn0206, 10_52202431.dscn0209, 10_54628916.jan8_06_536, 10_54628917.jan8_06_537, 10_54628920.
jan8_06_540, 10_66830630.sejklgec.sept10_06_746, 1_26_26013, 1_51_51078, 1_55_55005, 1_57_57848, 1_58_58671, 1_61_61269, 1_65_65984

3 [0.00 1.00] [0.20 0.90] [0.40 1.00] [0.00 0.69] [0.00 0.93] [0.00 0.52] [0.15 1.00] [0.00 0.79] 559.7,255.1,517.9,296.3,15.9,472.2,539.3,40.2,627.7,787.9
[0.00,0.70]
0_11_11164, 0_11_11981, 0_12_12891, 0_13_13347, 0_13_13386, 0_13_13700, 0_15_15022, 0_16_16704, 0_16_16768, 0_18_18565, 0_19_19593, 0_1_1288, 0_1_1409,
0_1_1427, 0_21_21244, 0_21_21394, 0_24_24453, 0_2_2756, 0_4_4240, 0_5_5091, 0_5_5291, 0_6_6646, 0_7_7478, 0_7_7917, 0_8_8859, 0_9_9398, 0_9_9453,
10_00000069_018, 10_00000093_007, 10_144439584_97e8823d39, 10_1557631_c5d89caa41, 10_156234186_1b16bc540f, 10_161208056_248b2c2ab6,
10_224830165_1fc5dcfecf, 10_236916255_307bc2272c, 10_262725611_c3ce2b827e, 10_264977628_cde9f779bc, 10_266774040_f40061481f,
10_267183829_e8538b16ff, 10_267412727_1727822888, 10_267744690_ac99310c04, 10_267780676_e2346e581a, 10_267781224_2039c2b3fb,
10_267781788_2765356aeb, 10_267935839_30e321dbe5, 10_267975486_b9cd08a46f, 10_267976432_b4e9429cff, 10_268001846_db119074c8,
10_268014760_9676395938, 10_268129990_04e858de05, 10_268168010_d3852a2e6e, 10_268211019_270132655c, 10_268217391_a99ec26edb,
10_268231155_a6210a13b7, 10_268231902_1586ec90e2, 10_268252475_3716432836_m, 10_29912713.stream5sm, 10_36687635.gpzoo_przewalski, 10_38173647.
pbcats06, 10_40677597.img_1543, 10_43047581.ds20050506_0127awfcat, 10_44859877.10, 10_52202392.dscn0140, 10_52202445.dscn0226, 10_52202453.
dscn0243, 10_52202457.dscn0251, 10_52202466.dscn0266, 10_58229768.20060405022_object_of_fancy, 10_59242003.img_4193, 10_59271038.138_3822,
10_59271075.img_1787, 10_66830586.6whtssec.sept10_06_705, 10_96835757_c609dbaa80, 10_98048098_af08c0f2df, 1_26_26013, 1_26_26532, 1_27_27034,
1_35_35795, 1_43_43206, 1_47_47611, 1_47_47818, 10_66830586.6whtssec.sept10_06_705, 1_51_51078, 1_53_53496, 1_54_54098, 1_56_56172, 1_56_56265,
1_57_57848, 1_60_60686, 1_62_62852, 1_63_63372, 1_63_63566, 1_65_65815, 1_65_65984, 1_66_66979,1_67_67033

The proposed approach is compared with the nine features used as individual saliency map generators, as described in Section 3, and with seven state-of-the-art methods. The state-of-the-art methods selected for comparison include Dense and Sparse Saliency (DSR) [56], Manifold Ranking (MR) [57], Bayesian Saliency Model (BSM) [58], Cluster-based Saliency (CS) [59], Low Rank Matrix Recovery (LRK) [60], Soft Image Abstraction (SIA) [61] and Hierarchical Saliency (HS) [62]. These benchmark methods are selected as they are recent, widely accepted by the community, and have made either their methods or their results available for public use.

5. Results

This section presents the performance comparison of the results obtained using the proposed approach with the individual feature based methods and the seven state-of-the-art methods.

5.1. Comparison with individual feature based methods

5.1.1. Precision recall curves based comparison

Fig. 4 shows the performance of individual features in comparison with the proposed XCSCA method on all chosen benchmark datasets in terms of PR curves (empirical as well as interpolated). The performance of the methods is summarized and ranked by the area under the precision–recall curve (AUCPR). It can be observed from Fig. 4 that at threshold 0 the precision values lie between 0.2 and 0.3 (at the right extreme of the PR curves) for all datasets. This indicates that 20–30% of pixels belong to the annotated salient objects for all datasets. At the other extreme of the PR curve, for lower recall values, only the proposed LCS method retains high precision values in most cases as compared with the individual feature based methods. This behavior corresponds to the smoother saliency maps of the proposed method, which maintain more true positives at higher thresholds, resulting in completely highlighted salient objects.

The high variability in the PR curves of the features demonstrates the performance gaps amongst individual features. It is noteworthy that the proposed XCSCA method is able to identify these large performance gaps and improve upon individual feature performance by minimizing them. The average PR results on all datasets in the bottom row of Fig. 4 demonstrate performance gains of 38.8% and 2.7% achieved by the XCSCA method as compared with the worst and best performing individual features, respectively.

5.1.2. Segmentation quality based comparison

Fig. 5 shows the segmentation performance of the individual features in comparison with the proposed XCSCA method on all datasets. The high variability of the precision, recall and F-measure metrics can be observed from Fig. 5. The average F-measure results on all datasets in Fig. 5(d) show performance improvements of 29.8% and 4.5% obtained by the proposed XCSCA method as compared with the worst and best performing individual features, respectively.

5.2. Comparison with state-of-the-art methods

5.2.1. Precision recall curves based comparison

Fig. 6 shows the performance of the seven state-of-the-art methods in comparison with the proposed XCSCA method on all chosen benchmark datasets in terms of PR curves (empirical as well as interpolated). In terms of the empirical PR curves, the proposed XCSCA method outperforms all the state-of-the-art techniques on all datasets. The overall superior quality of the saliency maps produced by the proposed method can be observed from the average PR results. It can be observed that several methods perform close to the proposed method at lower threshold values; however, the proposed method maintains the highest precision at higher threshold values (corresponding to lower recall) as compared with the state-of-the-art methods. The increased precision in the saliency results of the proposed method (at similar recall values) can be attributed to the appropriate weighting of features within each niche, resulting in suppression of background noise in cluttered scenarios.

Fig. 9. Images grouped by three representative experienced rules from the population.

In terms of the interpolated PR curves, DSR performs better than all methods on the MSRA dataset and HS performs better than all methods on the SOD dataset. However, on average XCSCA performs better than all methods, as shown in the bottom row of Fig. 6.

5.2.2. Segmentation quality based comparison

Fig. 7 shows the comparison of methods in terms of the quality of their induced segmentation for the MSRA, SOD, and SED2 datasets. Fig. 7 suggests that none of the methods consistently outperforms all other methods on this benchmark. However, on average the proposed method outperforms all the state-of-the-art methods on all datasets. It is noteworthy that the proposed method has consistently the highest precision as compared with all other state-of-the-art methods on all datasets, suggesting that the proposed method maintains the highest number of true positives inside object contours in the segmented output. The high precision and F-measure performance of the proposed method can be ascribed to the suppression of background noise achieved by the proposed method, especially on cluttered background images. Additionally, the robust performance on this benchmark can be partly associated with the learning of adaptive thresholds during the training stage of the proposed method.

6. Further discussions

6.1. Qualitative comparison

According to the precision recall curve based comparison, the proposed method maintained higher precision than the state-of-the-art methods at higher thresholds. Also, the segmentation results reveal that the proposed method consistently obtained the highest precision as compared with the other methods. The robustness of our approach in completely highlighting the salient objects and in uniform saliency assignment inside object contours as compared with the state-of-the-art methods can be observed from the visual comparison shown in Fig. 8. In most cases, the saliency maps produced by the proposed LCS method exhibit the highest quality in capturing the precise location of salient objects, complete salient objects and effective suppression of background noise as compared with the state-of-the-art methods.

Fig. 10. Saliency grouping results (Groups 1-3) corresponding to the images in Fig. 9. (For interpretation of the references to colors in this figure, the reader is referred to the web version of this paper.)

A noteworthy quality of the proposed saliency maps is the preservation of object details, as shown by the third and last rows of Fig. 8, which may benefit object recognition applications. The last row of Fig. 8 shows an example of saliency where the proposed method matches the object details more precisely than depicted in the ground truth and highlights both objects more evenly than the state-of-the-art. The first and fourth rows of Fig. 8 exhibit examples of proper highlighting of the complete salient object and suppression of background noise as compared with the state-of-the-art techniques. The fourth row of Fig. 8 shows an example of a complex scene with an ambiguous salient object. The state-of-the-art methods struggle to completely highlight the salient regions and include unwanted background noise, as in columns 5 and 6. The proposed method performs better than the state-of-the-art in capturing the salient region while suppressing the unwanted background noise.

6.2. Analysis of evolved rules

To further analyze the niching scheme of the evolved solutions, we performed an experiment on 500 test images from the MSRA dataset. Three highly experienced representative rules were selected from the population, as shown in Table 2.

The images and corresponding saliency maps covered by each rule were placed into distinct groups. To visualize the image and saliency groups formed by the rules, a dissimilarity matrix was created by utilizing the feature distances for each image. Afterwards, 2D mapping and normalization of the data were performed to display the images and saliency maps belonging to a group in the form of an embedded image. As numerous images were covered by each individual rule, it is not feasible to visualize all of them. Hence only a fixed-size grid of size 200 × 200 is chosen and image thumbnails of size 50 × 50 are used for visualization purposes. The arrangement of images is subject to the dissimilarities between their feature distances in the encoded feature space. Fig. 9 shows the image groups formed by the representative rules, while Fig. 10 shows the corresponding saliency maps. The images are placed into groups based on their feature composition and arranged in the embedded image according to their mutual distances in the encoded feature space. The corresponding saliency maps are computed by applying the corresponding classifier weights.
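The grouping visualization described above can be approximated with off-the-shelf multidimensional scaling. The sketch below (illustrative Python using scikit-learn, not the authors' tooling) embeds the images of one group in 2D from a dissimilarity matrix built on their encoded feature distances.

import numpy as np
from sklearn.manifold import MDS

def embed_group(feature_vectors):
    """feature_vectors: (n_images, 9) array of encoded constants d0..d8 for one group.
    Returns (n_images, 2) coordinates normalized to [0, 1] for thumbnail placement."""
    diffs = feature_vectors[:, None, :] - feature_vectors[None, :, :]
    dissimilarity = np.sqrt((diffs ** 2).sum(axis=2))        # pairwise feature distances
    coords = MDS(n_components=2, dissimilarity="precomputed",
                 random_state=0).fit_transform(dissimilarity)
    coords -= coords.min(axis=0)
    coords /= np.maximum(coords.max(axis=0), 1e-9)           # normalize to the unit square
    return coords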
As the selected classifiers are highly general and experienced, multiple images are covered by more than one individual classifier, as can be observed in Fig. 9. It is noteworthy that each group encompasses a variety of images belonging to different types. This is due to the fact that images are grouped based on a variety of extracted features, where each feature captures a different property of the image.

The grouped saliency responses in Fig. 10 elaborate the niching property of the system, as saliency responses with different characteristics are covered by different rules. In general, group 2 includes saliency outputs where high saliency is uniformly assigned inside object contours, as can be observed in the red highlighted cases. Groups 1 and 3 feature noisy saliency responses where the background confusion included in the two cases differs in its characteristics. The red squares show particularly clear examples for each group. It can be observed that the background clutter included by the group 1 saliency outputs appears to capture valid segments of the image belonging to the background. Conversely, the background noise included in the group 3 saliency outputs is scattered or dispersed.

The saliency outputs enclosed in black boxes belong to images covered by more than one rule. The different saliency responses generated by different rules can be easily observed for each duplicated image; e.g. the suppressed background saliency response produced by the group 3 rule for the hand image, as compared with the saliency output in group 1, is noteworthy. Conversely, groups 2 and 3 generate similar saliency outputs for the cycle image.

Fig. 11 shows the average feature distances of the images covered by each individual rule. This result depicts an important property of the niching scheme of the proposed system. It can be observed that the features f4-f8 appear to be similar for all the images, and these distinct rules are created to accommodate the varying nature of features f0-f3 for these images. It is also noted that the influence of f4 and f5 has been greatly reduced by the learnt weighting, i.e. they are effectively "don't care", showing the system can overcome the inclusion of a limited number of redundant or irrelevant features.

Fig. 11. Mean distance for images covered by an individual rule (average distance per feature f0-f8 for rules 1-3).

7. Conclusions

The goal of this study was to investigate the effectiveness of learning classifier systems in combining different image features to detect salient objects from images of different complexity levels. This goal was successfully achieved by incorporating a novel encoding scheme in a learning classifier system that computes actions using a linear combination of features. The proposed approach effectively learned different feature combinations for various types of images with different difficulty levels. The obtained results indicate that the proposed approach outperforms nine individual feature based methods and seven state-of-the-art combinatorial methods in terms of precision–recall curves and F-measures. Further, it is observed that the proposed method preserves more details of objects than the state-of-the-art methods, which may benefit object recognition applications. Future work includes the investigation of using richer features (such as those computed in deep neural networks) and experimenting with the LCS technique on image classification benchmarks.

Acknowledgments

This work was supported in part by the Marsden Fund of the New Zealand Government under contract VUW1209, administered by the Royal Society of New Zealand.

References

[1] T. Liu, Z. Yuan, J. Sun, J. Wang, N. Zheng, X. Tang, H.-Y. Shum, Learning to detect a salient object, IEEE Trans. Pattern Anal. Mach. Intell. 33 (2) (2011) 353–367.
[2] Y.-Z. Song, X. Bai, P.M. Hall, L. Wang, In search of perceptually salient groupings, IEEE Trans. Image Process. 20 (4) (2011) 935–947.
[3] S.S. Naqvi, W.N. Browne, C. Hollitt, Salient object detection via spectral matting, Pattern Recognit. 51 (2016) 209–224.
[4] R. Kimchi, M.A. Peterson, Figure-ground segmentation can occur without attention, Psychol. Sci. 19 (7) (2008) 660–668.
[5] S. Liu, X. Bai, Discriminative features for image classification and retrieval, Pattern Recognit. Lett. 33 (6) (2012) 744–751.
[6] H. Zhang, X. Bai, J. Zhou, J. Cheng, H. Zhao, Object detection via structural feature selection and shape model, IEEE Trans. Image Process. 22 (12) (2013) 4984–4995.
[7] H. Jiang, J. Wang, Z. Yuan, Y. Wu, N. Zheng, S. Li, Salient object detection: a discriminative regional feature integration approach, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 2083–2090.
[8] M.-M. Cheng, G.-X. Zhang, N.J. Mitra, X. Huang, S.-M. Hu, Global contrast based salient region detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2011, pp. 409–416.
[9] L. Shao, M. Brady, Specific object retrieval based on salient regions, Pattern Recognit. 39 (10) (2006) 1932–1948.
[10] X. Yang, X. Qian, T. Mei, Learning salient visual word for scalable mobile image retrieval, Pattern Recognit. 48 (10) (2015) 3093–3101.
[11] J. Wang, L. Quan, J. Sun, X. Tang, H.-Y. Shum, Picture collage, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 347–354.
[12] S. Goferman, A. Tal, L. Zelnik-Manor, Puzzle-like collage, Comput. Graph. Forum 29 (2) (2010) 459–468.
[13] Y.-C. Chen, V.M. Patel, R. Chellappa, P.J. Phillips, Salient views and view-dependent dictionaries for object recognition, Pattern Recognit. 48 (10) (2015) 3053–3066.
[14] L. Itti, Automatic foveation for video compression using a neurobiological model of visual attention, IEEE Trans. Image Process. 13 (10) (2004) 1304–1318.
[15] D.A. Klein, S. Frintrop, Center-surround divergence of feature statistics for salient object detection, in: Proceedings of the International Conference on Computer Vision, 2011, pp. 2214–2219.
[16] S. Goferman, L. Zelnik-Manor, A. Tal, Context-aware saliency detection, IEEE Trans. Pattern Anal. Mach. Intell. 34 (10) (2012) 1915–1926.
[17] T. Judd, K. Ehinger, F. Durand, A. Torralba, Learning to predict where humans look, in: Proceedings of the International Conference on Computer Vision, 2009, pp. 2106–2113.
[18] A. Borji, Boosting bottom-up and top-down visual features for saliency estimation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 438–445.
[19] Q. Zhao, C. Koch, Learning a saliency map using fixated locations in natural scenes, J. Vis. 11 (3) (2011) 1–15.
[20] Q. Zhao, C. Koch, Learning visual saliency by combining feature maps in a nonlinear manner using AdaBoost, J. Vis. 12 (6) (2012) 1–15.
[21] J.H. Holland, L.B. Booker, M. Colombetti, M. Dorigo, D.E. Goldberg, S. Forrest, R.L. Riolo, R.E. Smith, P.L. Lanzi, W. Stolzmann, S.W. Wilson, What is a learning classifier system?, in: Learning Classifier Systems, From Foundations to Applications, Springer, Berlin Heidelberg, 2000, pp. 3–32.
[22] L. Bull, T. Kovacs, Foundations of Learning Classifier Systems: An Introduction, Springer, Berlin Heidelberg, 2005.
[23] M. Iqbal, S.S. Naqvi, W.N. Browne, C. Hollitt, M. Zhang, Salient object detection using learning classifier systems that compute action mappings, in: Proceedings of the Genetic and Evolutionary Computation Conference, 2014, pp. 525–532.
[24] L. Bull, Applications of Learning Classifier Systems, Springer, Berlin Heidelberg, 2004.
[25] K. Shafi, T. Kovacs, H.A. Abbass, W. Zhu, Intrusion detection with evolutionary learning classifier systems, Nat. Comput. 8 (1) (2009) 3–27.
[26] I. Kukenys, W.N. Browne, M. Zhang, Transparent, online image pattern classification using a learning classifier system, in: Applications of Evolutionary Computation, Lecture Notes in Computer Science, vol. 6624, Springer, Berlin Heidelberg, 2011, pp. 183–193.
[27] M. Behdad, L. Barone, T. French, M. Bennamoun, On XCSR for electronic fraud detection, Evolut. Intell. 5 (2) (2012) 139–150.
[28] M. Iqbal, W.N. Browne, M. Zhang, XCSR with computed continuous action, in: Proceedings of the Australasian Joint Conference on Artificial Intelligence, 2012, pp. 350–361.
[29] M. Iqbal, W.N. Browne, M. Zhang, Learning complex, overlapping and niche imbalance Boolean problems using XCS-based classifier systems, Evolut. Intell. 6 (2) (2013) 73–91.
[30] M. Iqbal, W.N. Browne, M. Zhang, Reusing building blocks of extracted knowledge to solve complex, large-scale Boolean problems, IEEE Trans. Evolut. Comput. 18 (4) (2014) 465–480.
[31] M. Iqbal, W.N. Browne, M. Zhang, Extending XCS with cyclic graphs for scalability on complex Boolean problems, Evolut. Comput. (2015), http://dx.doi.org/10.1162/EVCO_a_00167.
[32] S.W. Wilson, Classifier fitness based on accuracy, Evolut. Comput. 3 (2) (1995) 149–175.
[33] T. Kovacs, Evolving Optimal Populations with XCS Classifier Systems, Technical Report CSR-96-17 and CSRP-9617, University of Birmingham, UK, 1996.
[34] S.W. Wilson, Generalization in the XCS classifier system, in: Proceedings of the Genetic Programming Conference, 1998, pp. 665–674.
[35] M.V. Butz, S.W. Wilson, An algorithmic description of XCS, Soft Comput. 6 (3–4) (2002) 144–153.
[36] S.S. Naqvi, W.N. Browne, C. Hollitt, Optimizing visual attention models for predicting human fixations using genetic algorithms, in: Proceedings of the IEEE Congress on Evolutionary Computation, 2013, pp. 1302–1309.
[37] L. Mai, F. Liu, Comparing salient object detection results without ground truth, in: Proceedings of the European Conference on Computer Vision, 2014, pp. 76–91.
[38] N. Tong, H. Lu, Y. Zhang, X. Ruan, Salient object detection via global and local cues, Pattern Recognit. 48 (10) (2015) 3258–3267.
[39] N. Singh, R. Arya, R. Agrawal, A novel approach to combine features for salient object detection using constrained particle swarm optimization, Pattern Recognit. 47 (4) (2014) 1731–1739.
[40] J. Yu, D. Tao, Y. Rui, J. Cheng, Pairwise constraints based multiview features fusion for scene classification, Pattern Recognit. 46 (2) (2013) 483–496.
[41] J. Yu, Y. Rui, D. Tao, Click prediction for web image reranking using multimodal sparse coding, IEEE Trans. Image Process. 23 (5) (2014) 2019–2032.
[42] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, New York, 2009.
[43] P.L. Lanzi, D. Loiacono, Classifier systems that compute action mappings, in: Proceedings of the Genetic and Evolutionary Computation Conference, 2007, pp. 1822–1829.
[44] T. Liu, Z. Yuan, J. Sun, J. Wang, N. Zheng, X. Tang, H.-Y. Shum, Learning to detect a salient object, IEEE Trans. Pattern Anal. Mach. Intell. 33 (2) (2011) 353–367.
[45] H. Fu, X. Cao, Z. Tu, Cluster-based co-saliency detection, IEEE Trans. Image Process. 22 (10) (2013) 3766–3778.
[46] B. Alexe, T. Deselaers, V. Ferrari, What is an object?, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2010, pp. 73–80.
[47] S.S. Naqvi, W.N. Browne, C. Hollitt, Combining object-based local and global feature statistics for salient object search, in: Proceedings of the International Conference on Image and Vision Computing New Zealand, 2013, p. 6.
[48] X. Hou, L. Zhang, Saliency detection: a spectral residual approach, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8.
[49] Y. Rubner, C. Tomasi, L.J. Guibas, The Earth Mover's distance as a metric for image retrieval, Int. J. Comput. Vis. 40 (2) (2000) 99–121.
[50] O. Pele, M. Werman, Fast and robust Earth Mover's distances, in: Proceedings of the International Conference on Computer Vision, 2009, pp. 460–467.
[51] P.L. Lanzi, D. Loiacono, S.W. Wilson, D.E. Goldberg, Generalization in the XCSF classifier system: analysis, improvement, and extension, Evolut. Comput. 15 (2) (2007) 133–168.
[52] A. Borji, D.N. Sihite, L. Itti, Salient object detection: a benchmark, in: Proceedings of the European Conference on Computer Vision, Part II, 2012, pp. 414–429.
[53] S. Alpert, M. Galun, R. Basri, A. Brandt, Image segmentation by probabilistic bottom-up aggregation and cue integration, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8.
[54] R. Achanta, S. Hemami, F. Estrada, S. Süsstrunk, Frequency-tuned salient region detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 1597–1604.
[55] K. Boyd, K.H. Eng, C.D. Page, Area under the precision–recall curve: point estimates and confidence intervals, in: Machine Learning and Knowledge Discovery in Databases, 2013, pp. 451–466.
[56] X. Li, H. Lu, L. Zhang, X. Ruan, M.-H. Yang, Saliency detection via dense and sparse reconstruction, in: Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 2976–2983.
[57] C. Yang, L. Zhang, H. Lu, X. Ruan, M.-H. Yang, Saliency detection via graph-based manifold ranking, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 3166–3173.
[58] Y. Xie, H. Lu, M.-H. Yang, Bayesian saliency via low and mid level cues, IEEE Trans. Image Process. 22 (5) (2013) 1689–1698.
[59] H. Fu, X. Cao, Z. Tu, Cluster-based co-saliency detection, IEEE Trans. Image Process. 22 (10) (2013) 3766–3778.
[60] X. Shen, Y. Wu, A unified approach to salient object detection via low rank matrix recovery, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 853–860.
[61] M.-M. Cheng, J. Warrell, W.-Y. Lin, S. Zheng, V. Vineet, N. Crook, Efficient salient region detection with soft image abstraction, in: Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 1529–1536.
[62] Q. Yan, L. Xu, J. Shi, J. Jia, Hierarchical saliency detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 1155–1162.

Muhammad Iqbal completed his PhD in Learning Classifier Systems at the School of Engineering and Computer Science, Victoria University of Wellington (VUW), New
Zealand. He is currently working as a postdoctoral research fellow at VUW, New Zealand. Iqbal's main research interests are in the area of evolutionary machine learning. His
research focuses on evolutionary image analysis and classification using transfer learning in genetic programming and learning classifier systems techniques. He is also
interested in medical image analysis, data mining, and scalability of evolutionary techniques.

Syed S. Naqvi has recently completed his PhD in Salient Object Detection at the School of Engineering and Computer Science, Victoria University of Wellington (VUW), New
Zealand. His research interests are in the area of visual saliency for generic object detection.

Will N. Browne is an Associate Professor at the School of Engineering and Computer Science, Victoria University of Wellington (VUW), New Zealand. His main area of
research is Applied Cognitive Systems. Essentially, how to use inspiration from natural intelligence to enable computers/machines/robots to behave usefully. This includes:
Cognitive Robotics, Learning Classifier Systems (a branch of evolutionary computation) and Modern Heuristics for industrial application.

Christopher Hollitt completed a BE(Hon) in Electrical and Electronic Engineering and a BSc(Hon) in Physics and Theoretical Physics at the University of Adelaide in 1994 and
1996 respectively. In 2007 he received a PhD from the same institution, having studied the control of high precision optomechanical systems for gravitational wave detection.
His current research work is in machine perception, image processing and robot control. The work spans a wide area, from fundamental problems in feature recognition,
through techniques for efficiently utilizing the limited sensory resources of a robot system, to high level applications of image processing.
Dr Hollitt is a senior lecturer in the School of Engineering and Computer Science at the Victoria University of Wellington.

Mengjie Zhang is a Professor of Computer Science at the School of Engineering and Computer Science, Victoria University of Wellington (VUW), New Zealand. His research is
mainly focused on evolutionary computation, particularly genetic programming, particle swarm optimization and learning classifier systems with application areas of image
analysis, multi-objective optimization, classification with unbalanced data, feature selection and reduction, and job shop scheduling. He has published over 400 academic
papers in refereed international journals and conferences. He has been serving as an associate editor or editorial board member for five international journals (including
IEEE Transactions on Evolutionary Computation and the Evolutionary Computation Journal) and as a reviewer of over fifteen international journals. He has been serving as a
steering committee member and a program committee member for over eighty international conferences.
