An Unsupervised Game-Theoretic Approach To Saliency Detection
we define one specific payoff function for each superpixel which incorporates multiple cues, including a spatial position prior, an objectness prior, and support from other superpixels. Adopting two independent priors makes the proposed method more robust: when one prior is inappropriate, the other may still work.

The goal of the proposed Saliency Game is to maximize the payoff of each player given the other players' strategies. This can be regarded as maximizing many competing objective functions simultaneously. The game equilibrium automatically provides a trade-off, so that when some image region cannot be assigned the right saliency value by optimizing one objective function (e.g., due to a misleading prior), optimizing the other objective functions may still assign it the right saliency value. This approach is natural for attention modeling and saliency detection, since features and objects likewise compete to capture our attention.

In addition, one main factor behind the astonishing success of deep neural networks is their powerful ability to learn high-level, semantically rich features. Using features extracted from a pre-trained CNN to build an unsupervised method is therefore an attractive option, as it exploits this strength while avoiding time-consuming training. However, rich semantic information comes at the cost of diluting image features through convolution and pooling layers. For this reason, we also use traditional color features as supplementary information. To make full use of these two complementary features for better detection results, we avoid simply taking a weighted sum of the raw results generated by the above Saliency Game in the two feature spaces. Instead, we propose an Iterative Random Walk across the two feature spaces, deep features and traditional CIE-Lab color features, to refine the saliency maps. In every iteration of the Iterative Random Walk, the propagation in each feature space is penalized by the latest output of the other. Figure 2 shows the pipeline of our algorithm.

In a nutshell, the main contributions of our work include:
1) We propose a novel unsupervised Saliency Game to detect salient objects. Adopting two independent priors improves robustness, and the nature of game equilibria preserves accuracy even when both priors are unsatisfactory;
2) By utilizing semantically rich features extracted from pre-trained fully convolutional networks (FCNs) [22] in an unsupervised manner, the proposed method is able to identify salient objects in complex scenes where traditional methods that use handcrafted features may fail (see Figure 1(c)); and
3) We propose an Iterative Random Walk algorithm across two feature spaces that takes advantage of the complementary relationship between the color feature space and the deep feature space to further refine the results.

II. RELATED WORK

Some saliency works have followed an unsupervised approach. In [12], the saliency of each region was defined as its absorption time from boundary nodes, which measures its global similarity with all boundary regions. Yang et al. [36] ranked the similarity of image regions with foreground or background cues via graph-based manifold ranking; the saliency value of each image element was determined by its relevance to given seeds. In [14], saliency patterns were mined to find foreground seeds according to prior maps, and foreground labels were propagated to unlabeled regions. Tong et al. [27] proposed a learning algorithm that bootstraps training samples generated from prior maps. These methods exploit either a boundary background prior or a foreground prior from a prior map, whereas we adopt two different priors for robustness; the priors act only as weak guidance, with very small weights, in the payoff function of our proposed Saliency Game.

Some deep learning based saliency detection methods have achieved great performance. In [30], two deep neural networks were trained, one to extract local features and the other to conduct a global search. Zhao et al. [37] proposed a multi-context deep neural network that takes both global and local context into consideration. Li et al. [18] explored high-quality visual features extracted from deep neural networks to improve the accuracy of saliency detection. In [17], high-level deep features and low-level handcrafted features were integrated in a unified deep learning framework for saliency detection. All of the above methods need much time and many images for training. In this work, we are not against CNN models; rather, we combine deep features with traditional color features in an unsupervised way, which results in an efficient unsupervised method, complementary to CNNs, that is on par with the above models that need labeled training data. We hope this will encourage new models that can utilize both labeled and unlabeled data.

Furthermore, game theory has been applied successfully in many computer vision and learning tasks. A grouping game among data points was proposed in [28]. Albarelli et al. [3] proposed a non-cooperative game between two sets of objects to be matched. A game between a region-based segmentation model and a boundary-based segmentation model was proposed in [6] to integrate the two sub-modules. Erdem et al. [9] formulated a multi-player game for transduction learning, whereby equilibria correspond to consistent labelings of the data. In [23], Miller et al. showed that the relaxation labeling problem [11] is equivalent to finding Nash equilibria of polymatrix n-person games. However, to the best of our knowledge, game theory has not yet been used for salient object detection.

III. DEFINITIONS AND SYMBOLS

In this section, we introduce definitions and symbols that will be used throughout the paper.

Superpixels: In our model, the processing units are superpixels segmented from an input image by the SLIC algorithm [2]. I = {1, 2, 3, ..., N} denotes the enumeration of the set of superpixels. Pi(x, y), i ∈ I, denotes the mask of the i-th superpixel, where Pi(x, y) = 1 indicates that the pixel located at (x, y) of the input image belongs to the i-th superpixel, and Pi(x, y) = 0 otherwise.

Features: We use FCN-32s [22] features due to their great success in semantic segmentation. We choose the output of
the Conv5 layer as feature maps for the input image. This is because features in the last layers of CNNs encode the semantic abstraction of objects and are robust to appearance variations. Since the feature maps and the image are not of the same resolution, we resize the feature maps to the input image size via linear interpolation. We denote the affinity between the i-th superpixel and the j-th superpixel in the deep feature space as A^d(i, j), which is defined to be their Gaussian-weighted Euclidean distance:

    A^d(i, j) = exp( -||f_i^d - f_j^d||^2 / σ^2 ),   (1)

where f_i^d is the deep feature vector of superpixel i. Each superpixel is represented by the mean deep feature vector of all its contained pixels.

The semantically rich deep features help accurately locate the targets but fail to describe low-level information. Therefore, we also employ color features as a complement to deep features. Inspired by [12], we use CIE-Lab color histograms to describe superpixels' color appearance. With the CIE-Lab color space divided into 8^3 ranges, the color feature vector of the i-th superpixel is denoted as f_i^c. The affinity between superpixels i and j in the color feature space is denoted as A^c(i, j), which is defined to be their Gaussian-weighted Chi-square distance:

    A^c(i, j) = exp( -χ^2(f_i^c, f_j^c) / σ^2 ).   (2)

Neighbor: We adopt a definition of 2-hop neighborhood which is frequently used in superpixel-based saliency detection methods. The set of the i-th superpixel's neighbors is denoted as N(i) = N1(i) ∪ N2(i) ∪ N3(i), where N1(i) is the set of superpixels that share at least one common edge with the i-th superpixel. N2(i) and N3(i) are defined as follows:

    N2(i) = { j | j ∈ N1(k), k ∈ N1(i), j ≠ i },   (3)

    N3(i) = ∅ if i ∉ B;  { j | j ∈ B, j ≠ i } if i ∈ B,   (4)

where B denotes the set of superpixels on the image boundary.

SUBMITTED TO IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. XX, NO. XX

Fig. 2. The pipeline of our algorithm. The input image is segmented into superpixels at several scales. The following process is applied to each scale, and the average of the results over all scales is taken as the final saliency map. First, the Saliency Game among superpixels is formulated in the two feature spaces ((a) and (b)), respectively, to generate the corresponding results ((c) and (d)). Second, an Iterative Random Walk is constructed to refine the results of the Saliency Game, and its outputs (e) and (f) are summed (with weights) as the result at one scale. In the Iterative Random Walk, the complete affinity metric in one feature space ((g) in the deep feature space and (h) in the color space) is fused with the neighboring affinity metric in the other feature space ((i) in the color space and (j) in the deep space). This is done to exploit the complementary relationship between the two feature spaces. Finally, we average the results over all scales to form the final saliency map.

IV. SALIENCY GAME

Here, we formulate a non-cooperative game among superpixels to detect salient objects in an input image. The input image is first segmented into N superpixels, which act as players in the Saliency Game. Each player chooses to be "background" or "foreground" as its pure strategy, and its mixed strategy corresponds to the superpixel's saliency value. After showing their strategies, players obtain payoffs according to both their own and the other players' strategies. Payoff is determined by a payoff function which incorporates position and objectness cues as well as support from others. We use each player's mixed strategy in the Nash equilibrium of the proposed Saliency Game as the saliency value of the corresponding superpixel in the output saliency map. Such an equilibrium corresponds to a steady state where each player plays a strategy that maximizes its own payoff when the remaining players' strategies are kept fixed, which provides a globally plausible saliency detection result.

A. Game setting

The pure strategy set is denoted as S = {0, 1}, where si = 1 means "to be foreground" and si = 0 means "to be background". All superpixels' pure strategies are collectively called a pure strategy profile, denoted as s = (s1, ..., sN). The strategy
profile set is denoted as Θ. πij(si, sj) denotes the single payoff that superpixel i obtains when playing pure strategy si against superpixel j, which holds pure strategy sj, in their 2-person game. There are four possible values of πij(si, sj), which can be put into a 2 × 2 matrix Bij:

    Bij = [ πij(0, 0)  πij(0, 1)
            πij(1, 0)  πij(1, 1) ].   (5)

The payoff of superpixel i in pure strategy profile s, where for all j ∈ I the j-th superpixel's pure strategy sj is the j-th component of the vector s, is denoted as πi(s). The payoff of superpixel i when it adopts a pure strategy ti (not necessarily the i-th component of s), while all other superpixels adopt the pure strategies in profile s, is denoted as πi(ti, s−i). We assume that the total payoff of superpixel i for playing with all the others is the sum of its payoffs for playing 2-player games with every other single superpixel. Formally, we assume that

    πi(s) = Σ_{j≠i} πij(si, sj)   and   πi(ti, s−i) = Σ_{j≠i} πij(ti, sj).

A pure best reply for player i against a pure strategy profile s is a pure strategy such that no other pure strategy gives a higher payoff to i against s. The i-th player's pure best-reply correspondence, which maps each pure strategy profile s ∈ Θ to a pure strategy si ∈ S, is denoted as βi : Θ → S:

    βi(s) = { si ∈ S | πi(si, s−i) ≥ πi(ti, s−i), ∀ti ∈ S }.   (6)

The combined pure best-reply correspondence β : Θ → Θ is defined as the Cartesian product of all players' pure best-reply correspondences:

    β(s) = ×_{i∈I} βi(s) ⊂ Θ.   (7)

A pure strategy profile s is a pure Nash equilibrium if s ∈ β(s).

A probability distribution over the pure strategy set is termed a mixed strategy. The mixed strategy of the i-th superpixel is denoted as a 2-dimensional vector zi = (zi0, zi1)^T, where zi0 = P(si = 0), zi1 = P(si = 1), and zi0 + zi1 = 1.

As assumed above, the total payoff of a superpixel is the sum of the payoffs of its 2-person games with every other single superpixel. Hence, here we focus on modeling the payoff of every 2-person game. We define the payoff πij(si, sj) of superpixel i for its 2-person game with j as a weighted sum of three terms:

    πij(si, sj) = λ1 · posi(si) + λ2 · obji(si) + sptij(si, sj),   (8)

where posi(si), obji(si), and sptij(si, sj) indicate the i-th superpixel's position prior, objectness prior, and the support that superpixel j gives to superpixel i, respectively. λ1 and λ2 are parameters controlling the weights of the first two terms.

Position: The position prior term in the payoff function is formulated based on the observation that salient objects often fall at the image center. The position term should give a greater payoff when a) center superpixels choose to be foreground, and b) boundary superpixels choose to be background. Taking (x0, y0) to be the image center and (xi, yi) to be the center coordinates of superpixel i, the position prior term is defined as follows:

    posi(si) = (1/N) · exp{ −((xi − x0)^2 + (yi − y0)^2) / σ }            if si = 1,
    posi(si) = (1/N) · (1 − exp{ −((xi − x0)^2 + (yi − y0)^2) / σ })      if si = 0.   (9)

Objectness: Generally, objects attract more attention than background clutter, so superpixels that are part of an object are more likely to be salient. The objectness term should give a greater payoff when a) superpixels with high objectness choose to be foreground, and b) superpixels with low objectness choose to be background.

We exploit the geodesic object proposal (GOP) [15] method to extract a set of No object proposal masks O1, ..., ONo, and define the objectness of a superpixel according to its overlap with all GOP proposals as follows:

    obji(si) = (1/(N · No)) · Σ_{j=1}^{No} [ Σ_{x,y} Oj(x, y) · Pi(x, y) / Σ_{x,y} Pi(x, y) ]              if si = 1,
    obji(si) = (1/N) · ( 1 − (1/No) · Σ_{j=1}^{No} [ Σ_{x,y} Oj(x, y) · Pi(x, y) / Σ_{x,y} Pi(x, y) ] )    otherwise.
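As an illustration of how the per-superpixel terms of Eqs. (8) and (9) can be computed, here is a minimal numpy sketch. The function names and sample σ, λ values are ours, and the support term sptij, whose definition lies outside this excerpt, is simply passed in as a given number:

```python
import numpy as np

def position_prior(centers, img_center, n, sigma):
    # Eq. (9): payoff for choosing foreground (s_i = 1) vs. background (s_i = 0),
    # based on the squared distance of each superpixel center to the image center.
    d2 = np.sum((centers - img_center) ** 2, axis=1)
    pos_fg = np.exp(-d2 / sigma) / n          # pos_i(1): large near the image center
    pos_bg = (1.0 - np.exp(-d2 / sigma)) / n  # pos_i(0): large far from the center
    return pos_fg, pos_bg

def pairwise_payoff(pos_i, obj_i, spt_ij, lam1, lam2):
    # Eq. (8): payoff of superpixel i in its 2-person game with j,
    # a weighted sum of position prior, objectness prior, and support.
    return lam1 * pos_i + lam2 * obj_i + spt_ij
```

Note that pos_i(1) + pos_i(0) = 1/N for every superpixel, so the position term redistributes a fixed payoff budget between the two pure strategies.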
[Figure residue: precision-recall and F-measure curves comparing the variants C(s1)-C(s4), C, D, CD(AVE), CD(IRW*), and CD(IRW) with MR, MCDL, RFCN, DRFI, and Ours; the plotted data is not recoverable from this extraction.]

In addition, as an unsupervised approach, our method is economical and practical.

Visual comparison of the proposed method against state-of-the-art on different datasets is shown in Figures 12, 13, 14, and 15.
Fig. 7. Sensitivity of the Saliency Game to parameters α, λ1 and λ2 , evaluated on ECSSD dataset.
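The equilibria of the Saliency Game are reached by Replicator Dynamics (see the equilibria experiments below). As a minimal two-strategy sketch, under our own assumptions rather than the paper's exact formulation, z1 holds each superpixel's current foreground probability and payoff_fg/payoff_bg return the expected payoffs of the two pure strategies given the current state; strictly positive payoffs are assumed:

```python
import numpy as np

def replicator_dynamics(payoff_fg, payoff_bg, z1_init, iters=200):
    # Discrete replicator dynamics over two pure strategies per player:
    # a pure strategy that earns above the player's average payoff
    # gains probability mass, until a fixed point (equilibrium) is reached.
    z1 = z1_init.copy()
    for _ in range(iters):
        u1 = payoff_fg(z1)                    # expected payoff of "foreground"
        u0 = payoff_bg(z1)                    # expected payoff of "background"
        avg = z1 * u1 + (1.0 - z1) * u0       # per-player average payoff
        z1 = z1 * u1 / avg                    # replicator update (payoffs > 0)
    return z1
```

For example, with a foreground-biased prior payoff plus a small support term from a neighbor, both players' foreground probabilities converge toward 1 from the uniform start z1 = 0.5.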
TABLE II
AUC scores. Supervised methods are marked in bold.

dataset    | BL     | DSR    | HS     | MR     | RC     | wCO
ECSSD      | 0.9143 | 0.8619 | 0.8838 | 0.8827 | 0.8342 | 0.8814
PASCAL-S   | 0.8671 | 0.8118 | 0.8362 | 0.8259 | 0.8139 | 0.8482
MSRA       | 0.9535 | 0.9382 | 0.9279 | 0.9267 | 0.8951 | 0.9360
HKU-IS     | 0.9140 | 0.9008 | 0.8782 | 0.8611 | 0.8530 | 0.8952
SOD        | 0.8503 | 0.8208 | 0.8145 | 0.7903 | 0.7924 | 0.8026
DUT-OMRON  | 0.8778 | 0.8787 | 0.8586 | 0.8447 | 0.8476 | 0.8846

dataset    | BSCA   | DRFI   | MCDL   | LEGS   | KSR    | Ours
ECSSD      | 0.9176 | 0.9404 | 0.9186 | 0.9235 | 0.9268 | 0.9272
PASCAL-S   | 0.8665 | 0.8950 | 0.8699 | -      | 0.9012 | 0.8724
MSRA       | 0.9484 | -      | -      | -      | -      | 0.9583
HKU-IS     | 0.9140 | 0.9435 | 0.9175 | 0.9026 | 0.9099 | 0.9183
SOD        | 0.8358 | 0.8813 | 0.8163 | 0.8268 | 0.8403 | 0.8481
DUT-OMRON  | 0.8779 | 0.9157 | 0.9014 | 0.8841 | 0.8921 | 0.8869

Fig. 8. Sensitivity of the Iterative Random Walk to parameters β and ρ2, evaluated on ECSSD dataset.

We compare the result adopted in the paper to the other four Nash equilibria, reached by Replicator Dynamics starting from four different interior points of ∆. We denote the initial state used in the paper as V^half, and the other four initial states as V^bd, V^pos, V^obj, and V^prior. Each of them is a 2 × N matrix whose i-th column vector (denoted as vi^bd, vi^pos, vi^obj, vi^prior, respectively) corresponds to the mixed strategy of superpixel i. The vectors vi^bd, vi^pos, vi^obj, and vi^prior are set as follows:

    vi^{bd,1} = 0.4 if i ∈ B, 0.5 otherwise;   vi^{bd,0} = 1 − vi^{bd,1};   (22)

    vi^{pos,1} = N · posi(1),   vi^{pos,0} = N · posi(0);   (23)

    vi^{obj,1} = N · obji(1),   vi^{obj,0} = N · obji(0);   (24)

    vi^{prior,1} = priori,   vi^{prior,0} = 1 − priori;   (25)

where priori is the saliency value of superpixel i computed by another saliency detection method; in this experiment, we use the MR [36] model to compute priori, ∀i ∈ I. We try the four different initial states to test whether inducing prior knowledge into the initial state leads to a better saliency detection result. Each of the five different Nash equilibria corresponds to a saliency detection result. We show the quantitative comparison of the five results in terms of F-measure curves and PR curves in Figure 9. It can be seen that the initial state V^half without any prior knowledge, which is adopted in the paper, leads to the best saliency detection.

Fig. 9. Comparison of the saliency detection results corresponding to five different Nash equilibria (evaluated on ECSSD dataset). V^half, V^pos, V^obj, V^bd, and V^prior: saliency detection results corresponding to the five Nash equilibria reached by Replicator Dynamics starting from the initial states V^half, V^pos, V^obj, V^bd, and V^prior, respectively.

VII. SUMMARY AND CONCLUSION

We propose a novel saliency detection algorithm. First, we formulate a Saliency Game among superpixels, and a saliency map is generated according to each region's strategy in the Nash equilibrium of the proposed Saliency Game. Second, an Iterative Random Walk that combines deep features and color features is constructed to refine the saliency maps generated in the first step. Extensive experiments over four benchmark datasets demonstrate that the proposed algorithm achieves favorable performance against state-of-the-art methods. The sensitivity analysis shows the robustness of the proposed method to parameter changes.

Different from most previous methods, which formulate saliency detection as minimizing one single energy function, the game-theoretic approach can be regarded as maximizing many competing objective functions simultaneously, with the game equilibrium automatically providing a trade-off. This seems very natural for attention modeling and saliency detection, as features and objects likewise compete to capture our attention.
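The Iterative Random Walk summarized above alternates propagation between the two feature graphs, with each space penalized by the other's latest output. The following is our own minimal illustration of that idea, not the paper's exact update rule: W_d and W_c are assumed to be row-stochastic transition matrices built from the deep and color affinities, and β is a mixing weight:

```python
import numpy as np

def iterative_random_walk(W_d, W_c, s_d, s_c, beta=0.9, iters=20):
    # Illustrative coupled random walk: each feature space propagates its
    # saliency over its own row-stochastic transition matrix, with a restart
    # toward the other space's latest output (the mutual-penalization idea).
    for _ in range(iters):
        s_d_new = beta * (W_d @ s_d) + (1.0 - beta) * s_c
        s_c_new = beta * (W_c @ s_c) + (1.0 - beta) * s_d
        s_d, s_c = s_d_new, s_c_new
    return 0.5 * (s_d + s_c)   # weighted sum of the two refined maps
```

Because every update is a convex combination of values in [0, 1], the refined saliency values remain valid probabilities throughout.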
SUPPLEMENTARY MATERIAL
ICCV 2017 Submission #726. CONFIDENTIAL REVIEW COPY. DO NOT DISTRIBUTE.

1. Performance Comparison

In the paper, we evaluate the proposed method on 4 benchmark datasets and compare its performance with 11 state-of-the-art methods, including BL [8], BSCA [7], DRFI [4], DSR [5], HS [11], LEGS [9], MCDL [13], MR [12], RC [1], wCO [14], and KSR [10]. Here we compare our method with them on two other datasets: DUT-OMRON [12] and SOD [6]. Performance comparison in terms of PR curves and F-measure curves is shown in Figure 1, and F-measure scores are shown in Table 1. Visual comparison of the proposed method against state-of-the-art on different datasets is shown in Figures 7, 8, 9, and 10.

Figure 1. Quantitative comparison in terms of PR curves and F-measure curves.

Table 1. F-measure scores. The best and the second best results are shown in red and green, respectively. Supervised methods are marked in bold. [Only scattered entries survive in this extraction; the full table is not reliably recoverable.]

Table 2. AUC scores. Supervised methods are marked in bold.

2. Equilibria

[The remaining pages of this extraction interleave the supplementary equilibria discussion and its figures (Figure 7, "Quantitative SOD comparisons in terms of PR curves and F-measure curves"; Figure 8, "Visual comparison of saliency maps") with later pages of the main paper (Fig. 10, "Quantitative comparison of models in terms of PR curves"). Beyond scattered fragments, e.g. the default initial state zi(0) = (0.5, 0.5), ∀i ∈ I, the note that different Nash equilibria might be reached from different interior points of ∆, and the observation that our unsupervised method performs on par with RFCN fine-tuned on about 5000-9000 training images, the text, tables, and plots are not recoverable.]
000 0.2 0.4 0.6 0.8 11
1 state used in the paper as ,state
1 and the
used other
inKSR the four paper initial Vstates
as0.60.8DRFI 1 ,as and the ,V other ,Recall
four ,0.6V
0.4 initial . Each
states as of 0 th ,
782
00 0.2 0.4 0.6 0.8 1 0.2
0.8 0.2
1 0.2
0.2 0.4
0.4 0.6 0.8
half half 0.2 0.2bd
0 0.2
pos 0.4
obj0.6
0.6
prior
0.8
0.8 11
0.2 bd
4052
0.2 0.4 0.6 0.8 1
031 0.1
0 0.2 0 0.4 Threshold 0.6
Recall 052V
0.8 1 0.1 0.2
827 Threshold 0 HKU-IS V
0.9140 0.9435 0.9175 08
0.9
783
0.8 Threshold
0 MSRA
0 0.7840 0.6 0.7841 0.7671 BL0 0.8041 MR0.5754 0.7937
Threshold
s-S a randomInputSOD
forest
GT Ours 0.2
KSR
MSRA
regressor,
0.4
LEGS0.1
Recall MCDL
LEGS DRFI0.8
and
0.2 BSCAInput SOD 0
0.3 GT
0.2
DSR
Ours840 0.2
0.4
0.4 Recall
0.40.5
HS
LEGS
0.6
MCDL0.6
RC DUT-OMRON
wCO 0.8
BSCA 0.7 1 BL 0 0.8
0.2
DSR 0.2MR0.9 HS
0.4
RC 1 0.6DUT-OMRON
0.8
wCO 0.8
1
F-Measure
re
re
re
re
re
0.5 0.5
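The equilibrium search sketched in the fragments above, Replicator Dynamics on a two-strategy polymatrix game started from z_i(0) = (0.5, 0.5), can be illustrated as follows. This is a minimal sketch, not the paper's implementation: the payoff matrices, the neighbourhood structure, and the reading of the first strategy's probability as the saliency value are hypothetical stand-ins, and payoff entries are assumed positive so the multiplicative update is well defined.

```python
import numpy as np

def replicator_dynamics(payoff, adjacency, n_iters=200, tol=1e-6):
    """Discrete-time Replicator Dynamics for a two-strategy polymatrix game.

    payoff[(i, j)] is the 2x2 payoff matrix of the two-player game between
    players i and j (defined only for neighbouring pairs), and adjacency[i]
    lists the neighbours of player i. All payoff entries are assumed positive.
    """
    n = len(adjacency)
    z = np.full((n, 2), 0.5)              # z_i(0) = (0.5, 0.5) for every player
    for _ in range(n_iters):
        z_prev = z.copy()
        for i in range(n):
            # Expected payoff of each pure strategy of player i against the
            # neighbours' current mixed strategies: s = sum_j A_ij z_j.
            s = np.zeros(2)
            for j in adjacency[i]:
                s += payoff[(i, j)] @ z[j]
            avg = z[i] @ s                # payoff of the current mixed strategy
            z[i] = z[i] * s / avg         # replicator update (stays on simplex)
        if np.abs(z - z_prev).max() < tol:
            break                         # (approximate) fixed point reached
    return z[:, 0]                        # probability of strategy 0 per player
```

On a toy chain of three players whose pairwise games reward agreeing on strategy 0, all players converge to a mixed strategy close to (1, 0), i.e., the iteration settles on a common equilibrium label.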
SUBMITTED TO IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. XX, NO. XX, XX 10
In addition, compared with CNN based saliency detection methods, which need to be trained on images with pixel-level masks as ground truth, the proposed method extracts features from a pre-trained CNN and combines them with color features in an unsupervised manner. This provides an efficient complement to CNNs that performs on par with models that need labeled training data. Hopefully, our approach will encourage future models that can utilize both labeled and unlabeled data.

ACKNOWLEDGMENT

The authors would like to thank...

REFERENCES

[1] R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk. Frequency-tuned salient region detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 1597–1604, 2009.
[2] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Susstrunk. SLIC superpixels. 2010.
[3] A. Albarelli, B. S. Rota, A. Torsello, and M. Pelillo. Matching as a non-cooperative game. In Proceedings of the IEEE International Conference on Computer Vision, pages 1319–1326, 2009.
[4] A. Borji, M.-M. Cheng, H. Jiang, and J. Li. Salient object detection: A benchmark. IEEE Transactions on Image Processing, 24(12):5706–5722, 2015.
[5] A. Borji and L. Itti. State-of-the-art in visual attention modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):185–207, 2013.
[6] A. Chakraborty and J. S. Duncan. Game-theoretic integration for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(1):12–30, 1999.
[7] M.-M. Cheng, N. J. Mitra, X. Huang, P. H. Torr, and S.-M. Hu. Global contrast based salient region detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(3):569–582, 2015.
[8] A. Deligkas, J. Fearnley, T. P. Igwe, and R. Savani. An empirical study on computing equilibria in polymatrix games. 2016.
[9] A. Erdem and M. Pelillo. Graph transduction as a non-cooperative game. Neural Computation, 24(3):700–723, 2012.
[10] J. T. Howson. Equilibria of polymatrix games. Management Science, 18(5-Part-1):312–318, 1972.
[11] R. A. Hummel and S. W. Zucker. On the foundations of relaxation labeling processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 5(3):267–287, 1983.
[12] B. Jiang, L. Zhang, H. Lu, C. Yang, and M. H. Yang. Saliency detection via absorbing Markov chain. In Proceedings of the IEEE International Conference on Computer Vision, pages 1665–1672, 2013.
[13] H. Jiang, J. Wang, Z. Yuan, Y. Wu, N. Zheng, and S. Li. Salient object detection: A discriminative regional feature integration approach. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 2083–2090, 2013.
[14] Y. Kong, L. Wang, X. Liu, H. Lu, and R. Xiang. Pattern mining saliency. In Proceedings of European Conference on Computer Vision, 2016.
[15] P. Krähenbühl and V. Koltun. Geodesic object proposals. In Proceedings of European Conference on Computer Vision, pages 725–739, 2014.
[16] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
[17] G. Lee, Y.-W. Tai, and J. Kim. Deep saliency with encoded low level distance map and high level features. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[18] G. Li and Y. Yu. Visual saliency based on multiscale deep features. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 5455–5463, 2015.
[19] X. Li, H. Lu, L. Zhang, X. Ruan, and M.-H. Yang. Saliency detection via dense and sparse reconstruction. In Proceedings of the IEEE International Conference on Computer Vision, pages 2976–2983, 2013.
[20] Y. Li, X. Hou, C. Koch, J. M. Rehg, and A. L. Yuille. The secrets of salient object segmentation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 280–287, 2014.
[21] T. Liu, J. Sun, N. N. Zheng, X. Tang, and H. Y. Shum. Learning to detect a salient object. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(2):353–367, 2011.
[22] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 1337–1342, 2015.
[23] D. A. Miller and S. W. Zucker. Copositive-plus Lemke algorithm solves polymatrix games. Operations Research Letters, 10(5):285–290, 1991.
[24] V. Movahedi and J. H. Elder. Design and perceptual validation of performance measures for salient object segmentation. In Computer Vision and Pattern Recognition Workshops, pages 49–56, 2010.
[25] Y. Qin, H. Lu, Y. Xu, and H. Wang. Saliency detection via cellular automata. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[26] P. D. Taylor and L. B. Jonker. Evolutionarily stable strategies and game dynamics. Journal of Theoretical Biology, 40(1-2):145–156, 1978.
[27] N. Tong, H. Lu, X. Ruan, and M.-H. Yang. Salient object detection via bootstrap learning. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 1884–1892, 2015.
[28] A. Torsello, S. R. Bulo, and M. Pelillo. Grouping with asymmetric affinities: A game-theoretic perspective. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 292–299, 2006.
[29] Z. Tu, Z. H. Zhou, W. Wang, J. Jiang, and B. Wang. Unsupervised metric fusion by cross diffusion. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 2997–3004, 2012.
[30] L. Wang, H. Lu, R. Xiang, and M. H. Yang. Deep networks for saliency detection via local estimation and global search. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 3183–3192, 2015.
[31] L. Wang, L. Wang, H. Lu, P. Zhang, and R. Xiang. Saliency detection with recurrent fully convolutional networks. In Proceedings of European Conference on Computer Vision, 2016.
[32] Q. Wang, W. Zheng, and R. Piramuthu. Grab: Visual saliency via novel graph model and background priors. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[33] T. Wang, L. Zhang, H. Lu, C. Sun, and J. Qi. Kernelized subspace ranking for saliency detection. In Proceedings of European Conference on Computer Vision, 2016.
[34] J. W. Weibull. Evolutionary Game Theory. MIT Press, 1999.
[35] Q. Yan, L. Xu, J. Shi, and J. Jia. Hierarchical saliency detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 1155–1162, 2013.
[36] C. Yang, L. Zhang, H. Lu, X. Ruan, and M.-H. Yang. Saliency detection via graph-based manifold ranking. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 3166–3173, 2013.
[37] R. Zhao, W. Ouyang, H. Li, and X. Wang. Saliency detection by multi-context deep learning. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 1265–1274, 2015.
[38] W. Zhu, S. Liang, Y. Wei, and J. Sun. Saliency optimization from robust background detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 2814–2821, 2014.

Yu Zeng received the B.S. degree in Electronic and Information Engineering from Dalian University of Technology (DUT), Dalian, China, in 2017. Her current research interests include image processing and computer vision with focus on image statistics and saliency detection.

Mengyang Feng received the B.E. degree in Electrical and Information Engineering from Dalian University of Technology in 2015. He is currently pursuing the Ph.D. degree under the supervision of Prof. H. Lu at Dalian University of Technology. His research interest focuses on salient object detection.
Huchuan Lu received the Ph.D. degree in System Engineering and the M.S. degree in Signal and Information Processing from Dalian University of Technology (DUT), Dalian, China, in 2008 and 1998, respectively. He joined the faculty in 1998 and currently is a Full Professor of the School of Information and Communication Engineering, DUT. His current research interests include computer vision and pattern recognition with focus on visual tracking, saliency detection, and segmentation. He is a member of the ACM and an Associate Editor of the IEEE Transactions on Cybernetics.

Ali Borji received the B.S. degree in Computer Engineering from the Petroleum University of Technology, Tehran, Iran, in 2001, the M.S. degree in computer engineering from Shiraz University, Shiraz, Iran, in 2004, and the Ph.D. degree in cognitive neurosciences from the Institute for Studies in Fundamental Sciences, Tehran, Iran, in 2009. He spent four years as a Post-Doctoral Scholar with iLab, University of Southern California, from 2010 to 2014. He is currently an Assistant Professor with the University of Wisconsin, Milwaukee. His research interests include visual attention, active learning, object and scene recognition, and cognitive and computational neurosciences.