
IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 32, NO. 3, MARCH 2024 755

Multiview Fuzzy Clustering Based on Anchor Graph


Weizhong Yu, Liyin Xing, Feiping Nie, Senior Member, IEEE, and Xuelong Li, Fellow, IEEE

Abstract—With the development of information technology, a large amount of multiview data has emerged, which makes multiview clustering algorithms considerably attractive. Previous graph-based multiview clustering methods usually contain two steps: obtaining the fusion graph or spectral embedding of all views, and then performing a clustering algorithm. This two-step process cannot obtain optimal results, since the two steps cannot negotiate with each other. To address this drawback, a novel algorithm named multiview fuzzy clustering based on anchor graph is presented. The proposed method can simultaneously obtain the membership matrix and minimize the disagreement rates of the different views. A novel regularization based on the trace norm is also presented in this article, which can not only obtain a clear clustering partition, preventing all samples from belonging to each cluster with the same membership value 1/c, but also balance the size of each cluster. Moreover, we exploit the reweighted method to optimize the proposed model, which introduces an adaptive weight for each view to deal with unreliable views. A series of experiments are conducted on different datasets, and the clustering performance verifies the effectiveness and efficiency of the proposed algorithm.

Index Terms—Fuzzy clustering, graph, multiview clustering, reweighted optimization framework.

Manuscript received 10 March 2023; revised 28 June 2023 and 8 August 2023; accepted 14 August 2023. Date of publication 18 August 2023; date of current version 1 March 2024. This work was supported in part by the Key Research and Development Program of Shaanxi under Grant 2023YBGY034, and in part by the National Natural Science Foundation of China under Grant 62176212. Recommended by Associate Editor P. D'Urso. (Weizhong Yu and Liyin Xing contributed equally to this work.) (Corresponding author: Feiping Nie.)

Weizhong Yu, Feiping Nie, and Xuelong Li are with the School of Artificial Intelligence, Optics and Electronics (iOPEN), the School of Cybersecurity, Northwestern Polytechnical University, Xi'an 710072, China, and also with the Key Laboratory of Intelligent Interaction and Applications (Northwestern Polytechnical University), Ministry of Industry and Information Technology, Xi'an 710072, China (e-mail: yuwz05@mail.xjtu.edu.cn; feipingnie@gmail.com; li@nwpu.edu.cn).

Liyin Xing is with the School of Cybersecurity and the School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an 710072, China (e-mail: liyinxing@mail.nwpu.edu.cn).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/TFUZZ.2023.3306639.

Digital Object Identifier 10.1109/TFUZZ.2023.3306639

I. INTRODUCTION

CLUSTERING is an important research direction in machine learning, whose purpose is to divide samples with higher similarity into the same category. With the development of information technology, a large amount of multiview data has emerged, such as documents in different languages; images, sounds, and subtitles in videos; and faces under different angles and lighting. How to process and divide such multiview data has become a new problem. Usually, multiview clustering algorithms use the principle of consensus and complementarity to explore the underlying clustering structure shared between different views and to synthesize the distinct information of each single view, thus achieving better clustering performance than directly concatenating the features of the different views.

Multiview clustering algorithms have been widely used in image classification [1], natural language processing [2], [3], [4], and bioinformatics and health informatics [5], [6]. Graph-based multiview clustering algorithms have also attracted a lot of interest due to their excellent performance [7], [8]. Many graph-based multiview clustering methods first learn a fusion graph for all views and then conduct an additional clustering algorithm on the fusion graph to obtain the final clustering results. This kind of algorithm can better fuse the multiview information, but it cannot directly obtain the clustering results. In order to avoid the two-step process, Nie et al. [9] proposed the constrained Laplacian rank (CLR) algorithm, which can directly obtain a graph with c connected components. The advantages of CLR have been clarified, and it has been widely applied in graph-based clustering algorithms [10]. Nie et al. [11] proposed a multiview clustering algorithm that adaptively assigns a weight to each view and then divides the fusion graph into c components via the CLR algorithm. However, it takes O(n²c) time to optimize the model (where n is the number of samples and c is the number of clusters). In order to accelerate it, Li et al. [12] introduced a bipartite graph into this framework and proposed an algorithm that can be applied to large-scale datasets.

However, these methods lose some information in the original data during the fusion. Therefore, in order to better exploit the complementary information between multiple views, some methods integrate graph construction and fusion into a unified framework, which has attracted widespread attention. Brbić and Kopriva [13] learn a joint subspace representation by constructing an affinity matrix shared among all views. Although these multiview subspace clustering algorithms achieve good clustering performance, they usually suffer from high computational complexity and cannot be applied to large-scale datasets. To solve this problem, Kang et al. [14] and Wang et al. [15] proposed several linear-complexity algorithms inspired by the idea of the anchor graph. In addition to subspace learning, Huang et al. [16] proposed an algorithm that learns similarity relationships in kernel spaces. Some scholars also use the adaptive neighborhood method to obtain the similarity matrix of each view and then fuse them to generate a unified similarity matrix [17]. Li and He [18] also adopt this strategy but introduce a bipartite graph to accelerate the algorithm. Recently, in order to further preserve the complementary information and spatial structure in different views, Wang et al. [19] and Liang et al. [20] focus on the representation complementarity between different views by introducing an additional exclusivity term. Huang et al. [21] simultaneously detected the multiview consistency and the

1063-6706 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: UNIVERSITE Cote d'Azur (Nice). Downloaded on March 12,2024 at 14:50:51 UTC from IEEE Xplore. Restrictions apply.

multiview diversity, and then fused the consistent parts into a target graph with a clear clustering structure. Following this, Huang et al. [22] introduced a novel algorithm that unifies the adaptive learning of similarity graphs with the detection of consistency and cross-view diversity. Li et al. [23] and Shu et al. [24] introduce tensors into the algorithm to explore higher-level information between different views.

Another kind of graph-based algorithm aims to obtain a consistent spectral embedding from the similarity graph of each view. Kumar et al. [2] proposed the coregularized multiview spectral clustering algorithm based on the idea of collaborative training. However, this method cannot distinguish the reliability of different views. In order to minimize the effect of unreliable views, Xia et al. [25] introduce an additional parameter as the exponent of the view weights and then adaptively learn the weights during the optimization process, but this parameter has a large impact on the clustering results. Nie et al. [26] propose a method that can automatically learn the optimal weight of each graph without parameter tuning. Li et al. [27] introduce a bipartite graph to accelerate the algorithm so that it can adapt to large-scale datasets. Although these algorithms fuse the spectral embeddings of all views, they still need a postprocessing step to obtain the final clustering results. To solve this problem, Shi et al. [28] use the spectral rotation (SR) technique to obtain clustering results from the spectral embeddings. Nie et al. [29] use SR to learn consistent clustering results from the spectral embeddings of multiple views. Shi et al. [30] propose a framework that jointly learns the consistent spectral embedding and the final clustering results via SR. Similarly, Qiang et al. [31] propose a new framework that can directly obtain the indicator matrix on the fusion graph of all views to avoid the two-step process. Yang et al. [32], Yang et al. [33], Shi et al. [34], and Hu et al. [35] use the nonnegative matrix factorization (NMF) method to directly obtain the indicator matrix.

Fig. 1. Algorithmic flows for the proposed algorithm.

To summarize, a lot of research focuses on two crucial problems of graph-based multiview algorithms: the postprocessing step and the high computational complexity. However, these methods can only identify which cluster a sample belongs to and do not provide information about its membership in the different clusters. Nevertheless, fuzzy clustering plays a crucial role in various practical applications. For example, in the field of marketing, fuzzy clustering can be a valuable tool for grouping customers based on diverse perspectives such as their needs, brand preferences, psychographic profiles, or other relevant marketing factors. Therefore, graph-based fuzzy clustering is worth exploring to address these challenges and improve the clustering performance in real-world scenarios. In this article, we present a new multiview fuzzy clustering based on the anchor graph (abbreviated as MVFCAG) that is designed to address these problems. The algorithm flow of the proposed MVFCAG is illustrated in Fig. 1. The main contributions of this article can be summarized as follows.

1) We integrate anchor graph construction and membership matrix learning into a unified framework to obtain the membership matrix F_i for each single view, and then we minimize the disagreement rate between each F_i and the final membership matrix Y to fuse the information in the different views. Thus, the proposed approach can obtain the membership matrix directly, without a postprocessing step.

2) We propose a novel regularization based on the trace norm, which can not only obtain a clear clustering partition, preventing all samples from belonging to each cluster with the same membership value 1/c, but also avoid all samples being assigned to the same cluster by balancing the size of each cluster. Furthermore, we provide a theoretical proof of this property in this article.

3) We exploit the reweighted method to optimize the proposed model, which can not only solve the nonconvex minimization problem but also adaptively add a weight for each view. Therefore, the proposed algorithm is able to deal with unreliable views.

4) Most fast algorithms are based on the anchor graph, which is also employed by the proposed algorithm. In this way, the computational complexity of our approach is reduced, making it suitable for large datasets.

The rest of this article is organized as follows. We review the works related to the anchor graph in Section II. Subsequently, our new method for multiview clustering is presented in Section III. Experiments on different datasets are conducted to validate the effectiveness and efficiency of the proposed method in Section IV. Finally, Section V concludes this article.

Notations: Throughout this article, Tr(M) denotes the trace of a matrix M, while \|M\|_F = \sqrt{\mathrm{Tr}(M^T M)} denotes the Frobenius norm of M. ρ_i(M) denotes the ith eigenvalue of the matrix M. \|M\|_* = \mathrm{Tr}(\sqrt{M^T M}) = \sum_{i=1}^{c} \sqrt{\rho_i(M^T M)} denotes the trace norm of M. \|x\|_2 = \sqrt{\mathrm{Tr}(x^T x)} denotes the 2-norm of a vector x.

II. RELATED WORKS

A. Anchor-Based Graph Model

A crucial problem for graph-based algorithms is the construction of the similarity graph. Traditional approaches to graph construction include k-nearest-neighbor graphs, ε-neighborhood graphs, and fully connected graphs. However, these graphs require high computational complexity to construct and are unsuitable for large datasets. Therefore, some fast graph construction

methods via a bipartite graph have recently been proposed [36], [37].

In order to construct a similarity graph efficiently via a bipartite graph, we first need to find a subset U = [u_1, u_2, ..., u_m] ∈ R^{m×d} as anchor points. The anchors can be selected via k-means clustering, or simply at random. Let B ∈ R^{n×m} represent the bipartite similarity graph between anchors and samples, where b_{ij} is the similarity between the sample x_i and the anchor u_j. Since each row of the matrix needs to be normalized, we can calculate B as follows:

b_{ij} = \begin{cases} \dfrac{K(x_i, u_j)}{\sum_{s \in \Phi_i} K(x_i, u_s)}, & j \in \Phi_i \\ 0, & j \notin \Phi_i \end{cases}  (1)

where Φ_i is the set containing the indexes of the k nearest anchors of x_i, while K(·) is a kernel function.

Rather than using a Gaussian kernel function to calculate the bipartite graph B, Nie et al. [9] propose an effective adaptive neighbor assignment strategy, and He et al. [38] adopt it to construct the anchor graph; it obtains the ith row of B by solving

\min_{b_i^T \mathbf{1} = 1,\ b_{ij} \ge 0} \sum_{j=1}^{m} \left( \|x_i - u_j\|_2^2\, b_{ij} + \gamma b_{ij}^2 \right)  (2)

where b_i^T represents the ith row of B, while γ is a parameter that can be set as γ = \frac{k}{2} d(i, k+1) - \frac{1}{2}\sum_{j=1}^{k} d(i, j) according to [9], and d(i, j) = \|x_i - u_j\|_2^2 represents the distance between x_i and its jth nearest anchor u_j. This method is also a nonparametric way to obtain a normalized bipartite graph. The solution of problem (2) is

b_{ij} = \begin{cases} \dfrac{d(i, k+1) - d(i, j)}{k\, d(i, k+1) - \sum_{j'=1}^{k} d(i, j')}, & j \in \Phi_i \\ 0, & j \notin \Phi_i \end{cases}  (3)

where Φ_i still represents the set containing the indexes of the k nearest anchors of x_i.

B. Anchor-Based Fuzzy Clustering

Fuzzy clustering is one of the most popular clustering algorithms, but it suffers from high computational complexity. Nie et al. [39] integrated anchor-based similarity graph construction and membership matrix learning into a unified framework to improve the clustering performance on large-scale datasets. They designed a quadratic programming model to learn the membership matrix of the selected anchors. Then, the anchor graph, which contains the connectivity between data points and anchors, can be exploited to calculate the memberships of all the data points. The membership value f_{ij} of data point i belonging to cluster j can be calculated as the weighted sum of the membership values of all anchors belonging to this cluster:

f_{ij} = b_{i1} z_{1j} + b_{i2} z_{2j} + \cdots + b_{im} z_{mj} = \sum_{l=1}^{m} b_{il} z_{lj}  (4)

where m is the number of anchors, z_{ij} is the membership value of anchor i belonging to cluster j, and Z ∈ R^{m×c} is the membership matrix of the anchors. Thus, the membership matrix of the data points can be expressed as F = BZ.

Algorithm 1: Reweighted Method for Problem (5).
Input: x ∈ Ω.
1: while not converged do
2:   Update each D_i by D_i = h'_i(g_i(x)).
3:   Update x by x = \arg\min_{x \in \Omega} \sum_i \mathrm{Tr}(D_i^T g_i(x)).
4: end while
Output: x.

C. Reweighted Optimization Framework

Recently, Nie et al. [40] put forward a novel reweighted optimization framework applicable to a range of nonconvex problems. The formulation of the nonconvex problems that can be solved by this method is

\min_x \sum_i h_i(g_i(x)), \quad \text{s.t.}\ x \in \Omega  (5)

where the composite function h_i(g_i(x)) is a concave function of the subfunction g_i(x), and Ω is the feasible region of x. Let h'_i(g_i(x)) = \partial h_i(g_i(x)) / \partial g_i(x); then the solution to problem (5) based on the reweighted optimization framework is presented in Algorithm 1. In this article, we will exploit the reweighted framework to optimize the proposed model.

III. METHODOLOGY

A. Motivation

As mentioned in Section II, the membership matrix of each view can be obtained by F_i = B_i Z_i. Thus, the specific clustering structure of each view can be easily revealed. In order to get a consistent clustering partition across all views, the membership matrices F_i should be close to each other. Therefore, we can introduce Y ∈ R^{n×c} as the final membership matrix of all views, which can be obtained by solving the following problem:

\min_{Z_i, Y} \sum_{i=1}^{v} \|B_i Z_i - Y\|_F \quad \text{s.t.}\ Z_i \mathbf{1} = \mathbf{1},\ Z_i \ge 0,\ Y\mathbf{1} = \mathbf{1},\ Y \ge 0  (6)

where \|M\|_F = \sqrt{\mathrm{Tr}(M^T M)} denotes the Frobenius norm of the matrix M.

However, problem (6) cannot divide the data points clearly into clusters. We put forward a novel regularization to deal with this in Section III-B, and then exploit the reweighted framework [40] to introduce an adaptive weight for each view during optimization.

B. Multiview Fuzzy Clustering Based on Anchor Graph (MVFCAG)

In order to achieve a clear clustering assignment, we introduce the novel regularization shown in problem (7), which can not only clearly divide the data points but also balance the size of


each cluster:

\max_{Y\mathbf{1}=\mathbf{1},\ Y \ge 0} \|Y\|_*  (7)

where \|Y\|_* = \mathrm{Tr}(\sqrt{Y^T Y}) = \sum_{i=1}^{c} \sqrt{\rho_i(Y^T Y)} denotes the trace norm (also known as the nuclear norm) of Y. The trace norm serves as a convex envelope of the rank function rank(M) and is commonly employed in mathematical optimization to identify low-rank matrices. In our approach, we maximize the trace norm to prevent the occurrence of empty clusters. Additionally, by imposing the constraint Y1 = 1, Y ≥ 0 in problem (7), we not only address the issue of empty clusters but also generate the most clear and balanced clustering results.

The proof of why maximizing problem (7) achieves the most clear and balanced clustering results is given in Theorem 1.

Theorem 1: Given n_1 + n_2 + ... + n_c = n, where n_j ≥ 0 denotes the size of the jth cluster, \|Y\|_* attains its maximum value \sqrt{nc} when Y ∈ R^{n×c} is a discrete indicator matrix and n_j = n/c.

Proof: The trace norm of Y can be written as

\|Y\|_* = \sum_{i=1}^{c} \sqrt{\rho_i(Y^T Y)}.  (8)

Let p ∈ R^{c×1} be a column vector with p_i = \sqrt{\rho_i(Y^T Y)}, and let q ∈ R^{1×c} be a row vector with all elements equal to 1. According to the Cauchy–Schwarz inequality [41], |⟨p, q⟩|² ≤ \|p\|_2^2 \|q\|_2^2, we have

(p_1 + p_2 + \cdots + p_c)^2 \le (p_1^2 + p_2^2 + \cdots + p_c^2)(1 + 1 + \cdots + 1)
\Leftrightarrow \|Y\|_*^2 \le c \sum_{i=1}^{c} \rho_i(Y^T Y).  (9)

Then, problem (7) can be transformed into the following maximization problem:

\max_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \sum_{i=1}^{c} \rho_i(Y^T Y)
\Leftrightarrow \max_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \mathrm{Tr}(Y^T Y)
\Leftrightarrow \max_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \sum_{i=1}^{n} \sum_{j=1}^{c} y_{ij}^2.  (10)

It is obvious that problem (10) is independent for each i; thus, it can be divided into n subproblems. For each i, we have

\max_{y_i\mathbf{1}=1,\ y_i \ge 0} \sum_{j=1}^{c} y_{ij}^2.  (11)

The maximum of problem (11) is achieved when y_i contains one and only one element equal to 1 and 0 otherwise, and the maximum value is 1. Therefore, we can conclude that problem (10) reaches its maximum value only when Y is the discrete cluster indicator matrix. In this case, we can calculate that Y^T Y ∈ R^{c×c} is a diagonal matrix whose ith diagonal element is the number of samples in the ith cluster. Thus, (8) can be rewritten as

\|Y\|_* = \mathrm{Tr}(\sqrt{Y^T Y}) = \sum_{i=1}^{c} \sqrt{n_i}.  (12)

We can further transform (9) into

(\sqrt{n_1} + \sqrt{n_2} + \cdots + \sqrt{n_c})^2 \le (n_1 + n_2 + \cdots + n_c)\, c
\Leftrightarrow \sqrt{n_1} + \sqrt{n_2} + \cdots + \sqrt{n_c} \le \sqrt{nc}.  (13)

According to the Cauchy–Schwarz inequality [41], the equality only holds when n_1 = n_2 = ... = n_c = n/c. Therefore, \|Y\|_* achieves its maximum value \sqrt{nc} when Y is a discrete indicator matrix and n_1 = n_2 = ... = n_c = n/c. □

According to Theorem 1, we can conclude that maximizing problem (7) introduces competition among the clusters, so that the memberships become clearer and the sizes of the clusters become more balanced. That is to say, the data points prefer to be clearly divided, with a large membership value, into balanced clusters of size n/c. Therefore, problem (7) can serve as a regularization for a clear clustering partition and a balanced cluster size.

Combining problem (6) and problem (7) with a regularization parameter λ, the overall optimization problem can be formulated as follows:

\min_{Z_i, Y} \sum_{i=1}^{v} \|B_i Z_i - Y\|_F - \lambda \|Y\|_*
\quad \text{s.t.}\ Z_i\mathbf{1} = \mathbf{1},\ Z_i \ge 0,\ Y\mathbf{1} = \mathbf{1},\ Y \ge 0.  (14)

C. Optimization

1) Update Y With the Fixed Z_i: In order to address problem (14), we iteratively optimize the variables Z_i and Y. When updating Y with the fixed Z_i, we exploit the reweighted method [40]. Therefore, we need to transform problem (14) into the same form as problem (5). Thus, we can set

g_i(Y) = \|B_i Z_i - Y\|_F^2, \quad h_i(g_i(Y)) = (g_i(Y))^{\frac{1}{2}}
G(Y) = Y, \quad H(G(Y)) = -\lambda \|G(Y)\|_*.  (15)

Then, problem (14) can be transformed into

\min_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \sum_{i=1}^{v} h_i(g_i(Y)) + H(G(Y)).  (16)

It is obvious that the gradient of h_i(g_i(Y)) is

h_i'(g_i(Y)) = \frac{1}{2\|B_i Z_i - Y\|_F}.  (17)

Then, we need to calculate the gradient of the trace norm. If the reduced singular value decomposition of Y is UΣV^T, it is shown in [42] that the subgradient of \|Y\|_* is

\partial \|Y\|_* = \left\{ UV^T + W : U^T W = 0,\ WV = 0,\ \|W\|_2 \le 1 \right\}.

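Theorem 1 can be checked numerically. The following NumPy sketch is our own illustration (with hypothetical sizes n = 12 and c = 3, not an experiment from this article): it compares \|Y\|_* for a balanced discrete indicator matrix, a single-cluster indicator matrix, and the uniform membership y_ij = 1/c.

```python
import numpy as np

n, c = 12, 3

def indicator(labels, c):
    """Discrete indicator matrix Y with y_ij = 1 iff sample i lies in cluster j."""
    Y = np.zeros((len(labels), c))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

balanced = indicator(np.repeat(np.arange(c), n // c), c)  # n_j = n/c for every j
single = indicator(np.zeros(n, dtype=int), c)             # all samples in cluster 0
uniform = np.full((n, c), 1.0 / c)                        # every membership equals 1/c

for name, Y in [("balanced", balanced), ("single", single), ("uniform", uniform)]:
    # ord="nuc" is NumPy's nuclear (trace) norm, i.e., the sum of singular values
    print(name, np.linalg.norm(Y, ord="nuc"))
# balanced: sqrt(n*c) = 6.0 > single: sqrt(n) ~ 3.46 > uniform: sqrt(n/c) = 2.0
```

The balanced discrete indicator attains \sqrt{nc} = 6, the single-cluster indicator only \sqrt{n} ≈ 3.46, and the uniform 1/c membership \sqrt{n/c} = 2, matching the ordering that the maximized regularizer in (7) rewards.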

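As a self-contained illustration of how Algorithm 1 behaves with the structure used in (15)-(17), consider a one-dimensional toy problem of our own (not taken from this article): minimizing \sum_i |x - a_i|, written as \sum_i h(g_i(x)) with g_i(x) = (x - a_i)^2 and h(g) = \sqrt{g}, which is concave in g. Then D_i = h'(g_i(x)) = 1/(2|x - a_i|), and step 3 of Algorithm 1 becomes a weighted least-squares problem with a closed-form solution.

```python
import numpy as np

# Toy instance of the reweighted framework (Algorithm 1):
# minimize sum_i |x - a_i| over x, via g_i(x) = (x - a_i)^2 and h(g) = sqrt(g).
a = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
x = 100.0                                    # arbitrary starting point
for _ in range(200):
    D = 1.0 / (2.0 * np.abs(x - a) + 1e-9)   # reweighting step, D_i = h'(g_i(x))
    x = (D * a).sum() / D.sum()              # weighted least-squares update
print(x)  # approaches the median of a, which minimizes sum_i |x - a_i|
```

The iterates converge to the median of the a_i, the minimizer of the nonsmooth objective, without ever differentiating the absolute value directly; the adaptive view weights d_i that appear later in the derivation arise from exactly this mechanism.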
Here, we present the particular subgradient of \|Y\|_* that is employed in this article:

\frac{\partial \|Y\|_*}{\partial Y} = Y (Y^T Y)^{-\frac{1}{2}} = U\Sigma V^T \left( (U\Sigma V^T)^T (U\Sigma V^T) \right)^{-\frac{1}{2}} = UV^T.  (18)

Thus, we can conclude that one subgradient of H(G(Y)) is

H'(G(Y)) = -\lambda Y (Y^T Y)^{-\frac{1}{2}}.  (19)

According to Algorithm 1, problem (16) can be transformed into problem (20), since h_i(g_i(Y)) and H(G(Y)) are both concave in their subfunctions:

\min_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \sum_{i=1}^{v} \mathrm{Tr}(d_i\, g_i(Y)) + \mathrm{Tr}(-\lambda D^T G(Y))
\Leftrightarrow \min_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \sum_{i=1}^{v} d_i \|B_i Z_i - Y\|_F^2 - \lambda\, \mathrm{Tr}(D^T Y)  (20)

where d_i = \frac{1}{2\|B_i Z_i - Y\|_F} and D = Y(Y^T Y)^{-\frac{1}{2}}. Note that d_i is also an adaptive weight for each view, which can distinguish the reliability of the different views.

To solve problem (20), we can rewrite it as follows:

\min_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \sum_{i=1}^{v} d_i \|B_i Z_i - Y\|_F^2 - \lambda\, \mathrm{Tr}(D^T Y)
\Leftrightarrow \min_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \sum_{i=1}^{v} d_i\, \mathrm{Tr}\left( (B_i Z_i - Y)^T (B_i Z_i - Y) \right) - \lambda\, \mathrm{Tr}(Y^T D)
\Leftrightarrow \min_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \sum_{i=1}^{v} d_i\, \mathrm{Tr}\left( Y^T Y - 2 Y^T B_i Z_i \right) - \lambda\, \mathrm{Tr}(Y^T D)
\Leftrightarrow \min_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \mathrm{Tr}\left( \sum_{i=1}^{v} d_i\, Y^T Y - 2 Y^T \sum_{i=1}^{v} d_i B_i Z_i \right) - \lambda\, \mathrm{Tr}(Y^T D)
\Leftrightarrow \min_{Y\mathbf{1}=\mathbf{1},\ Y\ge 0} \left\| Y - \frac{\sum_{i=1}^{v} d_i B_i Z_i + \frac{1}{2}\lambda D}{\sum_{i=1}^{v} d_i} \right\|_F^2.  (21)

Let M = \frac{\sum_{i=1}^{v} d_i B_i Z_i + \frac{1}{2}\lambda D}{\sum_{i=1}^{v} d_i}; then problem (21) can be rewritten as follows:

\min_Y \|Y - M\|_F^2 \quad \text{s.t.}\ Y\mathbf{1} = \mathbf{1},\ Y \ge 0
\Leftrightarrow \min_{y_i} \|y_i - m_i\|_2^2 \quad \text{s.t.}\ y_i \mathbf{1} = 1,\ y_i \ge 0  (22)

where y_i is the ith row of Y and m_i is the ith row of M. Problem (22) can be seen as a proximal problem, which will be discussed in detail later.

2) Update Each Z_i With the Fixed Y: When updating each Z_i with the fixed Y, problem (14) can be transformed into

\min_{Z_i\mathbf{1}=\mathbf{1},\ Z_i\ge 0} \|B_i Z_i - Y\|_F^2.  (23)

It is obvious that problem (23) is independent for each row of Z_i; thus, it can be transformed into

\min_{z_i\mathbf{1}=1,\ z_i\ge 0} \left\| \begin{bmatrix} b_i & B_r \end{bmatrix} \begin{bmatrix} z_i \\ Z_r \end{bmatrix} - Y \right\|_F^2
\Leftrightarrow \min_{z_i\mathbf{1}=1,\ z_i\ge 0} \|b_i z_i + B_r Z_r - Y\|_F^2
\Leftrightarrow \min_{z_i\mathbf{1}=1,\ z_i\ge 0} \mathrm{Tr}\left( z_i^T b_i^T b_i z_i + 2 z_i^T b_i^T (B_r Z_r - Y) \right)
\Leftrightarrow \min_{z_i\mathbf{1}=1,\ z_i\ge 0} \left\| z_i - \frac{(Y - B_r Z_r)^T b_i}{b_i^T b_i} \right\|_2^2  (24)

where z_i denotes the ith row of Z_i, b_i denotes the ith column of B_i, and B_r and Z_r collect the remaining columns of B_i and rows of Z_i, respectively.

Problems (22) and (24) share the same formulation, which can be rewritten in the more compact form

\min_x \frac{1}{2}\|x - v\|_2^2 \quad \text{s.t.}\ x^T \mathbf{1} = 1,\ x \ge 0  (25)

where x and v denote column vectors. This problem has been solved by Huang et al. [43]; we present the solution here. First, we write the Lagrangian function of problem (25) as

L(x, \alpha, \beta) = \frac{1}{2}\|x - v\|_2^2 + \alpha (x^T \mathbf{1} - 1) - \beta^T x  (26)

where α and β are the Lagrangian multipliers, both of which are to be determined. Suppose the optimal solution to the minimization problem (25) is x^*, with associated Lagrangian multipliers α^* and β^*. According to the KKT conditions, we have the following equations:

\forall j:\ x_j^* - v_j + \alpha^* - \beta_j^* = 0  (27)
\forall j:\ x_j^* \ge 0  (28)
\forall j:\ \beta_j^* \ge 0  (29)
\forall j:\ x_j^* \beta_j^* = 0  (30)

where x_j^* is the jth element of vector x^*, v_j is the jth element of vector v, and β_j^* is the jth element of vector β^*. Equation (27) can be written as x_j^* = v_j + β_j^* − α^*. According to (30), we have β_j^* = 0 or x_j^* = 0. Therefore, x_j^* = v_j − α^* if β_j^* = 0, and x_j^* = 0 if β_j^* = α^* − v_j. Combining (28) and (29), we have

x_j^* = \begin{cases} v_j - \alpha^*, & v_j - \alpha^* \ge 0 \\ 0, & v_j - \alpha^* < 0 \end{cases} = (v_j - \alpha^*)_+  (31)

where (x)_+ = \max(x, 0). According to the constraint x^T 1 = 1, we have

\sum_{j=1}^{c} (v_j - \alpha^*)_+ = 1.  (32)

Define the function f(\alpha) = \sum_{j=1}^{c} (v_j - \alpha)_+ - 1; then we can obtain α^* by solving this root-finding problem. Note that f'(α) ≤ 0 and f(α) is a piecewise linear and convex function, so we can use the Newton method to find the root


of f(α) = 0 efficiently, i.e.,

\alpha_{t+1} = \alpha_t - \frac{f(\alpha_t)}{f'(\alpha_t)}  (33)

where t denotes the iteration index.

Algorithm 2: The Algorithm of MVFCAG.
Input: The anchor graphs {B_1, B_2, ..., B_v} of all views; the number of clusters c; the regularization parameter λ.
1: Initialize: randomly initialize each Z_i and Y in their domains of definition.
2: while not converged do
3:   Fix Z_i; update Y via the reweighted method:
4:   while not converged do
5:     Update each d_i = \frac{1}{2\|B_i Z_i - Y\|_F} and D = Y(Y^T Y)^{-\frac{1}{2}}.
6:     Update Y by solving problem (22).
7:   end while
8:   Fix Y; update each Z_i sequentially:
9:   for each Z_i do
10:    for each row of Z_i do
11:      Update the ith row of Z_i by solving problem (24).
12:    end for
13:  end for
14: end while
Output: The membership matrix Y.

D. Complexity Analysis

The proposed algorithm consists of two stages: 1) updating Y via the reweighted method; and 2) updating the membership matrix Z_i of each view. It takes O(nmc) to obtain the membership matrix Y in the first stage and O(nm²c) to obtain the membership matrices of all views in the second stage; since the algorithm is independent for each view, the second stage is suitable for parallel computing. Therefore, the total computational complexity of the proposed algorithm is O(nm²c).

IV. EXPERIMENTS

In this section, we conduct several experiments to validate the advantages of the proposed algorithm. All experiments are conducted on a desktop computer with a 2.90-GHz Intel Core i7 CPU and 32.0-GB RAM, using MATLAB 2020b (64 bit). The first group of experiments shows the effectiveness of the proposed MVFCAG by presenting visualization results on synthetic datasets; the learned clusters demonstrate its capability to recover the structure shared by all views while simultaneously overlooking the unreliable views. The second group shows the clustering results on real-world datasets to demonstrate the superiority of MVFCAG over the baselines, in terms of both clustering performance and running time. The third group tests the parameter sensitivities of the number of anchors m and the regularization parameter λ. In this article, we assume all views are complete. The anchors are selected by k-means, and the input affinity matrix is obtained by (2) and (3).

A. Experimental Results on Synthetic Datasets

1) Settings: To prove the effectiveness of the proposed model, we conduct two toy experiments, the results of which are shown in Figs. 2 and 3. We first test the clustering performance of the proposed method and then verify its capability to deal with unreliable views on two synthetic datasets, in both cases by observing the visualized clustering results. The number of anchors is set to 100, and the regularization parameter λ is 2.7.

2) Datasets: In this section, we generate two datasets that both contain three views and 300 samples. One of the datasets consists of three two-moon datasets with different widths and distances in each view, as shown in Fig. 2(a). The other consists of three spherical datasets that overlap in each view, as shown in Fig. 2(d). In order to evaluate the ability to weaken the effect of unreliable views, we add a view of pure noise to both synthetic datasets, as shown in Fig. 3(a) and (d).

3) Results: Figs. 2 and 3 show the experimental results on the synthetic datasets. The different views of the datasets are shown in the first column of subfigures, where different colors indicate the different classes of the data points. The second column of subfigures shows the anchor graphs constructed in advance. It can be seen that both datasets are hard to divide in any single view, even for graph-based methods. The third column of subfigures shows the clustering results obtained by the proposed algorithm; different colors indicate which cluster a data point belongs to, and the purity of the color indicates the uncertainty. Fig. 2 shows that our method is effective for multiview datasets, since it can not only clearly divide samples with a very small between-class distance in each view, but also successfully cluster samples that completely overlap in some views. Fig. 3 shows the ability of our method to deal with unreliable views. Although the uncertainty of the samples at the edges of the clusters is larger when there is a noise view, as observed in Fig. 3(c) and (f), the proposed method can still correctly divide them into clusters. We also present the adaptive weight d_i of each view calculated by the reweighted method. It can be seen that the weight of the noise view is smaller than those of the others, which proves that the algorithm is able to reduce the impact of unreliable views.

B. Experimental Results on Real-World Datasets

1) Settings: The proposed algorithm is evaluated in terms of two aspects: one is the clustering performance, and the other is the efficiency. The clustering performance is evaluated with reference to clustering accuracy (ACC) and normalized mutual information (NMI). ACC calculates the percentage of correctly clustered samples, while NMI computes the mutual information between cluster labels and real labels. The efficiency is evaluated with reference to the time consumption of each algorithm. For the proposed MVFCAG, the number of anchors is set to about 10% of the number of samples, and the number of nearest neighbors is set to 5. The value of the regularization parameter λ is presented


Fig. 2. Experimental results for synthetic datasets without noise views. (a) and (d) Two datasets for experiment and different colors indicate the different classes
of the samples. (b) and (e) Anchor graph of each view constructed in advance along with the adaptive weight calculated by the algorithm. (c) and (f) Clustering
results obtained by the proposed algorithm and the different colors point out that which cluster the data point belongs to according to the proposed algorithm.
(a) Data. (b) Graph. (c) Result. (d) Data. (e) Graph. (f) Result.

Fig. 3. Experimental results for the synthetic datasets with a noise view. (a) and (d) The two datasets used in the experiment; different colors indicate the different classes of the samples. (b) and (e) The anchor graph of each view, constructed in advance, along with the adaptive weight calculated by the algorithm. (c) and (f) The clustering results obtained by the proposed algorithm; the colors indicate which cluster each data point is assigned to. (a) Data. (b) Graph. (c) Result. (d) Data. (e) Graph. (f) Result.
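The per-view anchor graphs shown in panels (b) and (e) can be built roughly as follows. This is a sketch only: random anchor sampling and Gaussian k-NN weights are assumptions for illustration; the paper constructs its anchor graphs with its own scheme.

```python
import numpy as np

def anchor_graph(X, m=50, k=5, seed=0):
    """Sketch of a per-view anchor graph: pick m anchors (random
    sampling here; k-means centers are also common in practice), then
    connect each sample to its k nearest anchors with row-normalized
    Gaussian weights. Returns Z of shape (n, m) with rows on the
    simplex (nonnegative, summing to 1)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    anchors = X[rng.choice(n, size=m, replace=False)]
    # squared Euclidean distance from every sample to every anchor
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    Z = np.zeros((n, m))
    for i in range(n):
        nn = np.argsort(d2[i])[:k]                 # k nearest anchors
        w = np.exp(-d2[i, nn] / (d2[i, nn].mean() + 1e-12))
        Z[i, nn] = w / w.sum()                     # rows sum to 1
    return Z
```

Each row of Z then has exactly k positive entries, one per connected anchor.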

in Table II. We will provide a simple method to determine the value of λ by observing the objective function, as discussed in the upcoming section. To randomize the experiments, each method is run 20 times and the average results are recorded in the tables.

2) Datasets: The proposed algorithm and the compared approaches are tested on seven real-world datasets. These datasets include outdoor scene images (MSRC [44] and scene [45]), face images (ORL [46]), an article collection (WikipediaArticles [47]), object images (Caltech101-20 [48]), and handwritten digits (UCI Digits [49] and MNIST [50]), as described in Table I.

3) Baselines: To evaluate the effectiveness and efficiency of the approach proposed in this article, we compare it with several representative approaches: four graph-based multiview clustering methods, auto-weighted multiple graph learning (AMGL) [26], self-weighted multi-view clustering (SwMC) [11], Co-Reg [2], and graph-based multi-view clustering (GMC) [17]; two multiview subspace clustering algorithms, fast parameter-free multi-view subspace clustering


with consensus anchor guidance (FPMVS-CAG) [15] and large-scale multi-view subspace clustering (LMVSC) [14]; two fast multiview clustering algorithms, scalable and parameter-free multi-view graph clustering (SFMC) [12] and adaptively-weighted integral space for fast multi-view clustering (AIMC) [51]; a late-fusion multi-view clustering method, late fusion alignment maximization (LFA) [52]; and a multiview fuzzy clustering method, MVFCM [53]. The relevant information on these methods is presented in Table VII.

TABLE I
BRIEF INTRODUCTION TO REAL DATASETS

TABLE II
PARAMETER SETTINGS

4) Results: Tables III and IV show the clustering results of the proposed method and the baselines on the seven real-world datasets. We highlight the best results in bold and underline the second-best results. From the tables, it can be observed that the proposed method MVFCAG performs better than the baselines. Our method takes the top place on 4 out of 7 datasets (MSRC, scene, Caltech101-20, and MNIST) in terms of ACC or NMI, and the second place on another two datasets (ORL and UCI Digits).

Compared with the graph-based SwMC, GMC, and SFMC, our method learns a shared clustering result for all views instead of a shared similarity graph. While the advantage of the proposed MVFCAG in terms of clustering quality may not be highly apparent here, it significantly improves the algorithm's efficiency. The other two graph-based algorithms, AMGL and Co-Reg, learn the shared clustering result of all views by obtaining consensus spectral embeddings of the different views. Thus, a postprocessing step is still necessary, which may degrade the clustering result. The two multiview subspace clustering methods are both designed to construct a consensus anchor graph based on subspace theory. FPMVS-CAG can obtain a consensus anchor graph with c connected components, while LMVSC still needs to run the spectral clustering algorithm to obtain the final clusters. Therefore, a unified framework, which is also an advantage of the proposed MVFCAG, is effective, since both FPMVS-CAG and MVFCAG outperform LMVSC by a large margin. AIMC is a fast multiview clustering method that effectively learns a shared cluster indicator matrix for all views. Its fusion strategy maps the original observations from each view into a latent integral space, and then obtains the shared cluster indicator matrix through nonnegative matrix factorization. Our method surpasses AIMC's performance on nearly all datasets, with the exception of WikipediaArticles. Consequently, we can confidently assert that our proposed method significantly improves the data partitioning process. LFA is a late-fusion multiview method that aligns the cluster partitions of each view to achieve a common clustering result. However, late-fusion methods often face a significant challenge: their performance relies heavily on the quality of the partitioning within each individual view, which can lead to poor clustering results for LFA. In contrast, our method also incorporates direct fusion of the membership matrix from each view, but we address the aforementioned problem by adjusting the labels of the anchor points in each view based on the common membership matrix. Additionally, this strategy does not compromise the computational superiority of late-fusion methods, as the number of anchor points is usually much smaller than the number of samples. MVFCM is a multiview fuzzy clustering method that conducts fuzzy c-means (FCM) in each single view, and then generates the membership matrix via collaborative training. In comparison, our proposed MVFCAG demonstrates remarkable superiority over MVFCM across all datasets, highlighting the advantages of our approach.

We also conduct several statistical tests to confirm the superiority of the proposed method, whose results are shown in Tables V and VI. Here, we use the t-test (also known as Student's test) to confirm whether the proposed MVFCAG outperforms the compared methods. The null hypothesis H0 is that the proposed MVFCAG cannot obtain better performance than the compared algorithm, and h = 1 means that the proposed MVFCAG outperforms the compared method. The p-value is the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis holds. It is observed that, on most of the datasets, the results confirm that the proposed method outperforms the compared methods.

The effectiveness of the proposed MVFCAG has been evaluated by all the aforementioned experiments. We then explore the efficiency of our method by comparing the running times of the algorithms. Table VIII shows the time consumption of the compared methods, including the proposed MVFCAG. It can be observed that our method has much shorter running
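The per-dataset t-tests reported in Tables V and VI can be sketched as follows. This is a minimal stdlib sketch: the critical value 1.686 (df = 38 at the 5% level, matching 20 runs per method) is an assumed constant, and the paper may use a different t-test variant.

```python
import math
from statistics import mean, stdev

def t_test_better(acc_ours, acc_base, crit=1.686):
    """One-sided two-sample t-test sketch. H0: the proposed method is
    not better than the baseline. Returns (t, h), where h = 1 rejects
    H0 at the 5% level. The critical value 1.686 (df = 38, i.e., 20
    runs per method) is an assumed constant; a t-distribution table or
    library would normally supply it, along with the exact p-value."""
    n1, n2 = len(acc_ours), len(acc_base)
    se = math.sqrt(stdev(acc_ours) ** 2 / n1 + stdev(acc_base) ** 2 / n2)
    t = (mean(acc_ours) - mean(acc_base)) / se
    return t, int(t > crit)
```

Feeding in the 20 per-run accuracies of two methods yields the t statistic and the h decision reported in the tables.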


TABLE III
CLUSTERING ACC ON THE REAL-WORLD DATASETS [MEAN ± STD(%)]

TABLE IV
CLUSTERING NMI ON THE REAL-WORLD DATASETS [MEAN ± STD(%)]

TABLE V
t-TEST FOR CLUSTERING ACC ON THE REAL-WORLD DATASETS

TABLE VI
t-TEST FOR CLUSTERING NMI ON THE REAL-WORLD DATASETS

TABLE VII
RELEVANT INFORMATION OF THE METHODS IN COMPARISON

time than most of the graph-based methods on the datasets with more than 2000 samples, especially the largest dataset MNIST, demonstrating the efficiency of the proposed method. However, compared with the three methods LMVSC, AIMC, and LFA, the proposed MVFCAG seems to perform poorly on the large dataset MNIST. This might be because the computational complexity of our algorithm is quadratic with respect to the number of anchor points. Although the three fast methods have less running time, their clustering performance is poorer than that of the proposed MVFCAG.

In sum, the experimental results demonstrate that the proposed method MVFCAG is not only effective but also efficient for multiview clustering problems. MVFCAG enjoys comparable computational complexity with existing anchor-based fast multiview clustering methods and better clustering performance than existing graph-based multiview clustering methods.

C. Parameter Sensitivity

In this section, we examine the effects of the number of anchors m and the regularization parameter λ in detail. The other parameter for graph construction, the number of nearest neighbors, is fixed at k = 5. We choose the


TABLE VIII
RUNNING TIME OF THE COMPARED ALGORITHMS ON THE REAL-WORLD DATASETS (SECONDS)

Fig. 4. Clustering accuracy of the proposed MVFCAG under different parameter combinations. (a) MSRC. (b) ORL. (c) Scene. (d) Caltech101-20. (e) UCI Digits. (f) MNIST.

Fig. 5. Relationship between D, R, the clustering accuracy, and the value of λ. The blue bars record the clustering accuracy, the red line records the value of D (labeled "Difference" in the figure), and the blue line records the value of R (labeled "Regularization" in the figure). (a) MSRC. (b) ORL. (c) Scene. (d) Caltech101-20. (e) UCI Digits. (f) MNIST.
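The two quantities plotted in Fig. 5, D = Σᵢ‖BᵢZᵢ − Y‖ (how far each view's prediction is from the consensus membership Y) and R = −‖Y‖∗ (how clear the partition is), together with the heuristic of increasing λ until D saturates, can be sketched as follows. The Frobenius norm for D and the `run_mvfcag` callable are assumptions for illustration, not the paper's exact interface.

```python
import numpy as np

def d_and_r(B_list, Z_list, Y):
    """Diagnostics from the parameter-sensitivity analysis (a sketch;
    the norm used for D is assumed to be Frobenius here): D sums the
    per-view gaps between the anchor-graph predictions B_i Z_i and the
    consensus membership Y, and R is minus the trace (nuclear) norm."""
    D = sum(np.linalg.norm(B @ Z - Y) for B, Z in zip(B_list, Z_list))
    R = -np.linalg.norm(Y, ord="nuc")
    return D, R

def pick_lambda(run_mvfcag, lam_grid, tol=1e-6):
    """Increase lambda along the grid and stop once D saturates (stops
    increasing), per the heuristic in the text. `run_mvfcag` is a
    hypothetical callable returning (B_list, Z_list, Y) for a given
    lambda; how close to the saturation point to stop is a judgment
    call in the paper as well."""
    prev = -np.inf
    for lam in lam_grid:
        D, _ = d_and_r(*run_mvfcag(lam))
        if D <= prev + tol:          # D has reached its maximum
            return lam
        prev = D
    return lam_grid[-1]
```

Sweeping the grid from the experiments, {0.25, 0.5, ..., 2.5}, and stopping where D plateaus reproduces the selection rule described in the text.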


percentage of anchors from the set {0.05, 0.1, 0.15, 0.2, 0.25} and the regularization parameter λ from the set {0.25, 0.5, 1, 1.25, 1.5, 1.75, 2, 2.25, 2.5}. Then, we record the clustering accuracy on six real-world datasets under the different parameter combinations in Fig. 4. As can be seen from the figure, the clustering accuracy improves when the number of anchors increases. However, the clustering performance does not continually improve with more anchors. To keep the algorithm efficient, taking 15% of the samples as anchors is enough to obtain an effective clustering result.

Nevertheless, the value of the regularization parameter λ has a great influence on the accuracy of the clustering results. To further explore the effects of λ, and then design an approach to choose a suitable value, we record two additional quantities, D = Σ_{i=1}^{v} ‖B_i Z_i − Y‖ and R = −‖Y‖_*, where D represents how different the clusters of each view are from the final result and R represents how clear the clustering is. Fig. 5 shows how these two values and the clustering accuracy change with regard to the value of λ. It can be seen that, as λ increases, the value of R continually declines, while the value of D increases, finally reaches its maximum, and stays steady. According to the theoretical analysis, both of these values need to be minimized for better clustering results. Therefore, when the value of R is not small enough or D reaches its maximum, the clustering accuracy decreases significantly. Thus, we can simply choose a suitable value for λ by continuously increasing it until D reaches its maximum.

V. CONCLUSION

In this article, we present a novel graph-based multiview fuzzy clustering algorithm that can directly obtain the membership matrix without any postprocessing step. An adaptive weight is introduced for each view during optimization to deal with unreliable views. The computational complexity of the algorithm is linear in the number of samples, so it is suitable for large-scale datasets. The advantages of the proposed approach are verified on both synthetic and real-world datasets. The experimental results demonstrate that our algorithm is able to obtain a more reliable membership matrix than the previous multiview fuzzy clustering algorithm. Additionally, our algorithm significantly improves efficiency compared to previous graph-based algorithms. However, the algorithm is somewhat sensitive to the regularization parameter λ. Although we have designed an approach to choose a suitable value for λ, parameter tuning is still a time-consuming process. How to adjust the parameter during the optimization will be explored in the future. Additionally, the computational complexity of our algorithm is quadratic with respect to the number of anchor points. Consequently, we observed during the experiments that our method consumes a long time when dealing with a large number of anchor points. This is also a topic that merits further study. Another concern pertains to the requirement of prior knowledge of the number of classes, which may not always be available in practical applications. To address this issue, we will investigate strategies that enable the algorithm to adaptively adjust the number of clusters.

REFERENCES

[1] G. Li, D. Song, W. Bai, K. Han, and R. Tharmarasa, "Consensus and complementary regularized non-negative matrix factorization for multi-view image clustering," Inf. Sci., vol. 623, pp. 524–538, 2023.
[2] A. Kumar, P. Rai, and H. Daume, "Co-regularized multi-view spectral clustering," in Proc. Adv. Neural Inf. Process. Syst., 2011, pp. 1413–1421.
[3] A. Kumar and H. Daumé, "A co-training approach for multi-view spectral clustering," in Proc. Int. Conf. Mach. Learn., 2011, pp. 393–400.
[4] J. Liu, C. Wang, J. Gao, and J. Han, "Multi-view clustering via joint nonnegative matrix factorization," in Proc. SIAM Int. Conf. Data Mining, 2013, pp. 252–260.
[5] J. Sun, J. Lu, T. Xu, and J. Bi, "Multi-view sparse co-clustering via proximal alternating linearized minimization," in Proc. Int. Conf. Mach. Learn., 2015, pp. 757–766.
[6] G. Chao et al., "Multi-view cluster analysis with incomplete data to understand treatment effects," Inf. Sci., vol. 494, pp. 278–293, 2019.
[7] Z. Li, F. Nie, X. Chang, Y. Yang, C. Zhang, and N. Sebe, "Dynamic affinity graph construction for spectral clustering using multiple features," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 12, pp. 6323–6332, Dec. 2018.
[8] R. Zhou, X. Chang, L. Shi, Y.-D. Shen, Y. Yang, and F. Nie, "Person reidentification via multi-feature fusion with adaptive graph learning," IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 5, pp. 1592–1601, May 2019.
[9] F. Nie, X. Wang, M. I. Jordan, and H. Huang, "The constrained Laplacian rank algorithm for graph-based clustering," in Proc. AAAI Conf. Artif. Intell., 2016, pp. 1969–1976.
[10] Z. Li, F. Nie, X. Chang, L. Nie, H. Zhang, and Y. Yang, "Rank-constrained spectral clustering with flexible embedding," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 12, pp. 6073–6082, Dec. 2018.
[11] F. Nie et al., "Self-weighted multiview clustering with multiple graphs," in Proc. Int. Joint Conf. Artif. Intell., 2017, pp. 2564–2570.
[12] X. Li, H. Zhang, R. Wang, and F. Nie, "Multiview clustering: A scalable and parameter-free bipartite graph fusion method," IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 1, pp. 330–344, Jan. 2020.
[13] M. Brbić and I. Kopriva, "Multi-view low-rank sparse subspace clustering," Pattern Recognit., vol. 73, pp. 247–258, 2018.
[14] Z. Kang, W. Zhou, Z. Zhao, J. Shao, M. Han, and Z. Xu, "Large-scale multi-view subspace clustering in linear time," in Proc. AAAI Conf. Artif. Intell., vol. 34, no. 04, 2020, pp. 4412–4419.
[15] S. Wang et al., "Fast parameter-free multi-view subspace clustering with consensus anchor guidance," IEEE Trans. Image Process., vol. 31, pp. 556–568, Dec. 2021.
[16] S. Huang, Z. Kang, I. W. Tsang, and Z. Xu, "Auto-weighted multi-view clustering via kernelized graph learning," Pattern Recognit., vol. 88, pp. 174–184, 2019.
[17] H. Wang, Y. Yang, and B. Liu, "GMC: Graph-based multi-view clustering," IEEE Trans. Knowl. Data Eng., vol. 32, no. 6, pp. 1116–1129, Jun. 2019.
[18] L. Li and H. He, "Bipartite graph based multi-view clustering," IEEE Trans. Knowl. Data Eng., vol. 34, no. 7, pp. 3111–3125, Jul. 2020.
[19] X. Wang, X. Guo, Z. Lei, C. Zhang, and S. Z. Li, "Exclusivity-consistency regularized multi-view subspace clustering," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 923–931.
[20] Y. Liang, D. Huang, C.-D. Wang, and S. Y. Philip, "Multi-view graph learning by joint modeling of consistency and inconsistency," IEEE Trans. Neural Netw. Learn. Syst., to be published, doi: 10.1109/TNNLS.2022.3192445.
[21] S. Huang, I. W. Tsang, Z. Xu, and J. Lv, "Measuring diversity in graph learning: A unified framework for structured multi-view clustering," IEEE Trans. Knowl. Data Eng., vol. 34, no. 12, pp. 5869–5883, Dec. 2021.
[22] S. Huang, I. W. Tsang, Z. Xu, and J. Lv, "CGDD: Multiview graph clustering via cross-graph diversity detection," IEEE Trans. Neural Netw. Learn. Syst., to be published, doi: 10.1109/TNNLS.2022.3201964.
[23] Z. Li, C. Tang, X. Liu, X. Zheng, W. Zhang, and E. Zhu, "Consensus graph learning for multi-view clustering," IEEE Trans. Multimedia, vol. 24, pp. 2461–2472, 2021.
[24] X. Shu, X. Zhang, Q. Gao, M. Yang, R. Wang, and X. Gao, "Self-weighted anchor graph learning for multi-view clustering," IEEE Trans. Multimedia, to be published, doi: 10.1109/TMM.2022.3193855.
[25] T. Xia, D. Tao, T. Mei, and Y. Zhang, "Multiview spectral embedding," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 40, no. 6, pp. 1438–1446, Jun. 2010.
[26] F. Nie et al., "Parameter-free auto-weighted multiple graph learning: A framework for multiview clustering and semi-supervised classification," in Proc. Int. Joint Conf. Artif. Intell., 2016, pp. 1881–1887.


[27] Y. Li, F. Nie, H. Huang, and J. Huang, "Large-scale multi-view spectral clustering via bipartite graph," in Proc. AAAI Conf. Artif. Intell., 2015, pp. 2750–2756.
[28] S. Shi, F. Nie, R. Wang, and X. Li, "Fast multi-view clustering via prototype graph," IEEE Trans. Knowl. Data Eng., vol. 35, no. 1, pp. 443–455, Jan. 2021.
[29] F. Nie, L. Tian, and X. Li, "Multiview clustering via adaptively weighted procrustes," in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2018, pp. 2022–2030.
[30] S. Shi, F. Nie, R. Wang, and X. Li, "Auto-weighted multi-view clustering via spectral embedding," Neurocomputing, vol. 399, pp. 369–379, 2020.
[31] Q. Qiang, B. Zhang, F. Wang, and F. Nie, "Fast multi-view discrete clustering with anchor graphs," in Proc. AAAI Conf. Artif. Intell., vol. 35, no. 11, 2021, pp. 9360–9367.
[32] B. Yang, X. Zhang, Z. Lin, F. Nie, B. Chen, and F. Wang, "Efficient and robust multiview clustering with anchor graph regularization," IEEE Trans. Circuits Syst. Video Technol., vol. 32, no. 9, pp. 6200–6213, Sep. 2022.
[33] B. Yang, X. Zhang, B. Chen, F. Nie, Z. Lin, and Z. Nan, "Efficient correntropy-based multi-view clustering with anchor graph embedding," Neural Netw., vol. 146, pp. 290–302, 2022.
[34] S. Shi, F. Nie, R. Wang, and X. Li, "Multi-view clustering via nonnegative and orthogonal graph reconstruction," IEEE Trans. Neural Netw. Learn. Syst., vol. 34, no. 1, pp. 201–214, Jan. 2023.
[35] Z. Hu, F. Nie, R. Wang, and X. Li, "Multi-view spectral clustering via integrating nonnegative embedding and spectral embedding," Inf. Fusion, vol. 55, pp. 251–259, 2020.
[36] W. Liu, J. He, and S.-F. Chang, "Large graph construction for scalable semi-supervised learning," in Proc. Int. Conf. Mach. Learn., 2010, pp. 679–686.
[37] X. Hu et al., "Multi-view fuzzy classification with subspace clustering and information granules," IEEE Trans. Knowl. Data Eng., to be published, doi: 10.1109/TKDE.2022.3231929.
[38] F. He, F. Nie, R. Wang, H. Hu, W. Jia, and X. Li, "Fast semi-supervised learning with optimal bipartite graph," IEEE Trans. Knowl. Data Eng., vol. 33, no. 9, pp. 3245–3257, Sep. 2021.
[39] F. Nie, C. Liu, R. Wang, Z. Wang, and X. Li, "Fast fuzzy clustering based on anchor graph," IEEE Trans. Fuzzy Syst., vol. 30, no. 7, pp. 2375–2387, Jul. 2021.
[40] F. Nie, D. Wu, R. Wang, and X. Li, "Truncated robust principle component analysis with a general optimization framework," IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 2, pp. 1081–1097, Feb. 2020.
[41] J. M. Steele, The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[42] G. A. Watson, "Characterization of the subdifferential of some matrix norms," Linear Algebra Appl., vol. 170, no. 1, pp. 33–45, 1992.
[43] J. Huang, F. Nie, and H. Huang, "A new simplex sparse learning model to measure data similarity for clustering," in Proc. Int. Joint Conf. Artif. Intell., 2015, pp. 3569–3575.
[44] Y. J. Lee and K. Grauman, "Foreground focus: Unsupervised learning from partially matching images," Int. J. Comput. Vis., vol. 85, pp. 143–166, 2009.
[45] G.-S. Xie, X.-B. Jin, Z. Zhang, Z. Liu, X. Xue, and J. Pu, "Retargeted multi-view feature learning with separate and shared subspace uncovering," IEEE Access, vol. 5, pp. 24895–24907, 2017.
[46] F. S. Samaria and A. C. Harter, "Parameterisation of a stochastic model for human face identification," in Proc. IEEE Workshop Appl. Comput. Vis., 1994, pp. 138–142.
[47] J. C. Pereira et al., "On the role of correlation and abstraction in cross-modal multimedia retrieval," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 3, pp. 521–535, Mar. 2013.
[48] D. Dueck and B. J. Frey, "Non-metric affinity propagation for unsupervised image categorization," in Proc. IEEE Int. Conf. Comput. Vis., 2007, pp. 1–8.
[49] D. Dua and C. Graff, "UCI machine learning repository," 2017. [Online]. Available: http://archive.ics.uci.edu/ml
[50] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[51] M.-S. Chen, T. Liu, C.-D. Wang, D. Huang, and J.-H. Lai, "Adaptively-weighted integral space for fast multiview clustering," in Proc. 30th ACM Int. Conf. Multimedia, 2022, pp. 3774–3782.
[52] S. Wang et al., "Multi-view clustering via late fusion alignment maximization," in Proc. Int. Joint Conf. Artif. Intell., 2019, pp. 3778–3784.
[53] M.-S. Yang and K. P. Sinaga, "Collaborative feature-weighted multi-view fuzzy c-means clustering," Pattern Recognit., vol. 119, 2021, Art. no. 108064.

Weizhong Yu received the Ph.D. degree in electronic science from the Xi'an Research Institute of Hi-Tech, Xi'an, China, in 2012. Between 2005 and 2011, he also studied at the Institute of Microelectronics, Tsinghua University, Beijing, China, for his Ph.D. degree. He is currently an Associate Researcher with the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi'an. His main research interests include machine learning and computer vision.

Liyin Xing received the B.S. degree in automation from Northwestern Polytechnical University, Xi'an, China, in 2021. She is currently working toward the M.S. degree in cybersecurity with the School of Cybersecurity and the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University. Her research interests include clustering and its applications.

Feiping Nie (Senior Member, IEEE) received the Ph.D. degree in computer science from Tsinghua University, Beijing, China, in 2009. He is currently a Full Professor with Northwestern Polytechnical University, Xi'an, China. He has authored and co-authored more than 100 papers in the following journals and conferences: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), International Journal of Computer Vision (IJCV), IEEE Transactions on Image Processing (TIP), IEEE Transactions on Neural Networks and Learning Systems (TNNLS), IEEE Transactions on Knowledge and Data Engineering (TKDE), International Conference on Machine Learning (ICML), Conference on Neural Information Processing Systems (NIPS), ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), International Joint Conference on Artificial Intelligence (IJCAI), Association for the Advancement of Artificial Intelligence (AAAI), International Conference on Computer Vision (ICCV), IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), and ACM Multimedia Conference (ACM MM). His papers have been cited more than 20 000 times, and his H-index is 84. His research interests include machine learning and its applications, such as pattern recognition, data mining, computer vision, image processing, and information retrieval. He is currently serving as an Associate Editor or PC member for several prestigious journals and conferences in the related fields.

Xuelong Li (Fellow, IEEE) is currently a Full Professor with the School of Artificial Intelligence, OPtics and ElectroNics, Northwestern Polytechnical University, Xi'an, China.

