Multiview Fuzzy Clustering Based On Anchor Graph
Abstract—With the development of information technology, a large amount of multiview data has emerged, which makes multiview clustering algorithms considerably attractive. Previous graph-based multiview clustering methods usually contain two steps: obtaining the fusion graph or spectral embedding of all views, and then performing a clustering algorithm. This two-step process cannot obtain optimal results since the two steps cannot negotiate with each other. To address this drawback, a novel algorithm named multiview fuzzy clustering based on anchor graph is presented. The proposed method can simultaneously obtain the membership matrix and minimize the disagreement among different views. A novel regularization based on the trace norm is also presented in this article, which can not only obtain a clear clustering partition, preventing all samples from belonging to every cluster with the same membership value 1/c, but also balance the size of each cluster. Moreover, we exploit the reweighted method to optimize the proposed model, which introduces an adaptive weight for each view to deal with unreliable views. A series of experiments is conducted on different datasets, and the clustering performance verifies the effectiveness and efficiency of the proposed algorithm.

Index Terms—Fuzzy clustering, graph, multiview clustering, reweighted optimization framework.

Manuscript received 10 March 2023; revised 28 June 2023 and 8 August 2023; accepted 14 August 2023. Date of publication 18 August 2023; date of current version 1 March 2024. This work was supported in part by the Key Research and Development Program of Shaanxi under Grant 2023YBGY034, and in part by the National Natural Science Foundation of China under Grant 62176212. Recommended by Associate Editor P. D'Urso. (Weizhong Yu and Liyin Xing contributed equally to this work.) (Corresponding author: Feiping Nie.)
Weizhong Yu, Feiping Nie, and Xuelong Li are with the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), the School of Cybersecurity, Northwestern Polytechnical University, Xi'an 710072, China, and also with the Key Laboratory of Intelligent Interaction and Applications (Northwestern Polytechnical University), Ministry of Industry and Information Technology, Xi'an 710072, China (e-mail: yuwz05@mail.xjtu.edu.cn; feipingnie@gmail.com; li@nwpu.edu.cn).
Liyin Xing is with the School of Cybersecurity and the School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an 710072, China (e-mail: liyinxing@mail.nwpu.edu.cn).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TFUZZ.2023.3306639.
Digital Object Identifier 10.1109/TFUZZ.2023.3306639

I. INTRODUCTION

CLUSTERING is an important research direction in machine learning, whose purpose is to divide samples with higher similarity into the same category. With the development of information technology, a large amount of multiview data has emerged, such as documents in different languages; images, sounds, and subtitles in videos; and faces under different angles and lighting. How to process and divide such multiview data has become a new problem. Usually, multiview clustering algorithms use the principle of consensus and complementarity to explore the underlying clustering structure shared between different views and synthesize the distinct information of each single view, thus achieving better clustering performance than directly concatenating the features of different views.

Multiview clustering algorithms have been widely used in image classification [1], natural language processing [2], [3], [4], and bioinformatics and health informatics [5], [6]. Graph-based multiview clustering algorithms have also attracted a lot of interest due to their excellent performance [7], [8]. Many graph-based multiview clustering methods first learn a fusion graph for all views and then conduct an additional clustering algorithm on the fusion graph to obtain the final clustering results. This kind of algorithm can better fuse the multiview information, but it cannot directly obtain the clustering results. In order to avoid the two-step process, Nie et al. [9] propose the constrained Laplacian rank (CLR) algorithm, which can directly obtain a graph with c connected components. The advantages of CLR have been clarified, and it has been widely applied in graph-based clustering algorithms [10]. Nie et al. [11] propose a multiview clustering algorithm that adaptively assigns a weight to each view and then divides the fusion graph into c components via the CLR algorithm. However, it takes O(n²c) time to optimize the model (where n is the number of samples and c is the number of clusters). In order to accelerate it, Li et al. [12] introduced a bipartite graph into this framework and proposed an algorithm that can be applied to large-scale datasets.

However, these methods lose some information in the original data during the fusion. Therefore, in order to better exploit the complementary information between multiple views, some methods integrate graph construction and fusion into a unified framework, which has attracted widespread attention. Brbić and Kopriva [13] learn a joint subspace representation by constructing the affinity matrix shared among all views. Although these multiview subspace clustering algorithms have good clustering performance, they usually suffer from high computational complexity and cannot be applied to large-scale datasets. To solve this problem, Kang et al. [14] and Wang et al. [15] proposed several linear-complexity algorithms inspired by the idea of the anchor graph. In addition to subspace learning, Huang et al. [16] proposed an algorithm that learns similarity relationships in kernel spaces. Some scholars also use the adaptive neighborhood method to obtain the similarity matrix of each view and then fuse them to generate a unified similarity matrix [17]. Li and He [18] also adopt this strategy but introduce the bipartite graph into it to accelerate the algorithm. Recently, in order to further preserve the complementary information and spatial structure in different views, Wang et al. [19] and Liang et al. [20] focus on the representation complementarity between different views by introducing an additional exclusivity term. Huang et al. [21] simultaneously detected the multiview consistency and the
1063-6706 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: UNIVERSITE Cote d'Azur (Nice). Downloaded on March 12,2024 at 14:50:51 UTC from IEEE Xplore. Restrictions apply.
756 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 32, NO. 3, MARCH 2024
clustering, or simply at random. Let B ∈ R^{n×m} represent the bipartite similarity graph between anchors and samples, where b_ij is the similarity between the sample x_i and the anchor u_j. Since each row of the matrix needs to be normalized, we can calculate B as follows:

    b_ij = K(x_i, u_j) / Σ_{s∈Φ_i} K(x_i, u_s),  if j ∈ Φ_i;
    b_ij = 0,  if j ∉ Φ_i                                          (1)

where Φ_i is the set containing the indexes of the k nearest anchors of x_i, and K(·) is a kernel function.

Rather than using a Gaussian kernel function to calculate the bipartite graph B, Nie et al. [9] propose an effective adaptive neighbor assignment strategy, and He et al. [38] adopt it to construct the anchor graph; the ith row of B is obtained by solving

    min_{b_i^T 1 = 1, b_ij ≥ 0}  Σ_{j=1}^{m} ( ||x_i − u_j||² b_ij + γ b_ij² )      (2)

where b_i^T represents the ith row of B, and γ is a parameter that can be set as γ = (k/2) d(i, k+1) − (1/2) Σ_{j=1}^{k} d(i, j) according to [9]. Here, d(i, j) = ||x_i − u_j||² represents the squared distance between x_i and its jth nearest anchor u_j. This method is also a nonparametric way to obtain a normalized bipartite graph. The solution of problem (2) is

    b_ij = ( d(i, k+1) − d(i, j) ) / ( k·d(i, k+1) − Σ_{j=1}^{k} d(i, j) ),  if j ∈ Φ_i;
    b_ij = 0,  if j ∉ Φ_i                                          (3)

where Φ_i still represents the set containing the indexes of the k nearest anchors of x_i.

B. Anchor-Based Fuzzy Clustering

Fuzzy clustering is one of the most popular clustering algorithms but suffers from high computational complexity. Nie et al. [39] integrated anchor-based similarity graph construction and membership matrix learning into a unified framework to improve the clustering performance on large-scale datasets. They designed a quadratic programming model to learn the membership matrix of selected anchors. Then, the anchor graph that encodes the connectivity between data points and anchors can be exploited to calculate the membership of all the data points. The membership value f_ij of data point i belonging to cluster j is the weighted sum of the membership values of all anchors belonging to this cluster:

    f_ij = b_i1 z_1j + b_i2 z_2j + ... + b_im z_mj = Σ_{l=1}^{m} b_il z_lj      (4)

where m is the number of anchors, z_ij is the membership value of anchor i belonging to cluster j, and Z ∈ R^{m×c} is the membership matrix of anchors. Thus, the membership matrix of the data points can be expressed as F = BZ.

C. Reweighted Optimization Framework

Recently, Nie et al. [40] put forward a novel reweighted optimization framework that efficiently handles a range of nonconvex problems. The nonconvex problems that can be solved by this method have the form

    min_{x∈Ω} Σ_i h_i(g_i(x))      (5)

where each composite function h_i(g_i(x)) is a concave function of the subfunction g_i(x), and Ω is the feasible region of x. Let h_i'(g_i(x)) = ∂h_i(g_i(x)) / ∂g_i(x); then the solution to problem (5) based on the reweighted optimization framework is presented in Algorithm 1. In this article, we will exploit the reweighted framework to optimize the proposed model.

Algorithm 1: Reweighted Optimization Framework.
Input: Initialize x ∈ Ω.
1: while not converge do
2:   Update D_i by D_i = h_i'(g_i(x)).
3:   Update x by x = arg min_{x∈Ω} Σ_i Tr(D_i^T g_i(x)).
4: end while
Output: x.

III. METHODOLOGY

A. Motivation

As mentioned in Section II, the membership matrix of each view can be obtained by F_i = B_i Z_i. Thus, the specific clustering structure of each view can be easily revealed. In order to get a consistent clustering partition of all views, the membership matrices F_i should be close to each other. Therefore, we can introduce Y ∈ R^{n×c} as the final membership matrix of all views, which can be obtained by solving the following problem:

    min_{Z_i, Y} Σ_{i=1}^{v} ||B_i Z_i − Y||_F
    s.t. Z_i 1 = 1, Z_i ≥ 0, Y1 = 1, Y ≥ 0      (6)

where ||M||_F = ( Tr(M^T M) )^{1/2} denotes the Frobenius norm of the matrix M.

However, problem (6) cannot divide the data points clearly into clusters. We put forward a novel regularization to deal with this in Section III-B and then exploit the reweighted framework [40] to introduce an adaptive weight for each view during optimization.

B. Multiview Fuzzy Clustering Based on Anchor Graph (MVFCAG)

In order to achieve a clear clustering assignment, we introduce the novel regularization shown in problem (7), which can not only clearly divide the data points but also balance the size of
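As a concrete illustration of Section II, the parameter-free anchor-graph construction in (3) and the membership propagation F = BZ in (4) can be sketched in a few lines of NumPy. This is a minimal sketch assuming plain Euclidean features; the function name `anchor_graph` and the random demo data are ours, not from the paper.

```python
import numpy as np

def anchor_graph(X, U, k):
    """Parameter-free bipartite graph of eq. (3): connect each sample to
    its k nearest anchors with adaptive weights; each row of B sums to 1."""
    n, m = X.shape[0], U.shape[0]
    # d(i, j) = ||x_i - u_j||^2, squared Euclidean distances to all anchors
    D = ((X[:, None, :] - U[None, :, :]) ** 2).sum(axis=2)
    B = np.zeros((n, m))
    for i in range(n):
        idx = np.argsort(D[i])[:k + 1]     # indexes of the k+1 nearest anchors
        d = D[i, idx]                      # sorted distances d(i,1..k+1)
        denom = k * d[k] - d[:k].sum()     # k*d(i,k+1) - sum_j d(i,j)
        B[i, idx[:k]] = (d[k] - d[:k]) / denom   # eq. (3)
    return B

rng = np.random.default_rng(0)
X, U = rng.random((20, 2)), rng.random((5, 2))   # toy samples and anchors
B = anchor_graph(X, U, k=3)
Z = rng.random((5, 3)); Z /= Z.sum(axis=1, keepdims=True)  # anchor memberships
F = B @ Z   # sample-level membership matrix, eq. (4)
```

Because each row of B and each row of Z sums to one, each row of F is again a valid membership vector.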
    ⇔ ||Y||_*² ≤ c Σ_{i=1}^{c} ρ_i(Y^T Y).      (9)

Then, problem (7) can be transformed into the following maximization problem:

    max_{Y1=1, Y≥0} Σ_{i=1}^{c} ρ_i(Y^T Y)
    ⇔ max_{Y1=1, Y≥0} Tr(Y^T Y)
    ⇔ max_{Y1=1, Y≥0} Σ_{i=1}^{n} Σ_{j=1}^{c} y_ij².      (10)

It is obvious that problem (10) is independent for each i. Thus, problem (10) can be divided into n subproblems. For each i, we have

    max_{y_i^T 1 = 1, y_i ≥ 0} Σ_{j=1}^{c} y_ij².      (11)

The maximum of problem (11) is achieved when y_i contains one and only one element equal to 1 and is 0 otherwise, and the maximum value is 1. Therefore, we can conclude that problem (10) reaches its maximum value only when Y is the discrete cluster indicator matrix.

    s.t. Z_i 1 = 1, Z_i ≥ 0, Y1 = 1, Y ≥ 0.      (14)

C. Optimization

1) Update Y With the Fixed Z_i: In order to address problem (14), we iteratively optimize the variables Z_i and Y. When updating Y with the Z_i fixed, we exploit the reweighted method [40] to deal with the optimization problem. Therefore, we need to transform problem (14) into the same form as problem (5). Thus, we can set

    g_i(Y) = ||B_i Z_i − Y||_F²,   h_i(g_i(Y)) = ( g_i(Y) )^{1/2}
    G(Y) = Y,   H(G(Y)) = −λ ||G(Y)||_*.      (15)

Then, problem (14) can be transformed into

    min_{Y1=1, Y≥0} Σ_{i=1}^{v} h_i(g_i(Y)) + H(G(Y)).      (16)

It is obvious that the gradient of h_i(g_i(Y)) is

    h_i'(g_i(Y)) = 1 / ( 2 ||B_i Z_i − Y||_F ).      (17)

Then, we need to calculate the gradient of the trace norm. If the reduced singular value decomposition of Y is UΣV^T, it is shown in [42] that the subgradient of ||Y||_* is

    ∂||Y||_* = { UV^T + W : U^T W = 0, WV = 0, ||W||_2 ≤ 1 }.
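Taking W = 0 in this set yields the particular subgradient UV^T, which coincides with the closed form Y(Y^T Y)^{-1/2} whenever Y has full column rank. The following NumPy snippet checks this identity numerically; it is an illustrative sketch, and the helper names are ours:

```python
import numpy as np

def nuclear_subgrad(Y):
    """One subgradient of ||Y||_*: take W = 0, i.e., U V^T from the
    reduced SVD Y = U S V^T."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

def inv_sqrt(A):
    """A^(-1/2) for a symmetric positive definite matrix A."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

rng = np.random.default_rng(0)
Y = rng.random((8, 3)) + 0.1    # full column rank with high probability
G1 = nuclear_subgrad(Y)         # U V^T
G2 = Y @ inv_sqrt(Y.T @ Y)      # closed form used in the text
```

For such a Y, the two expressions should agree up to numerical error, which is the identity behind the weight matrix D = Y(Y^T Y)^{-1/2} used in the Y-update.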
Here, we present the particular subgradient of ||Y||_* that is employed in this article:

    ∂||Y||_* / ∂Y = Y (Y^T Y)^{−1/2} = UV^T.      (18)

Thus, we can conclude that one subgradient of H(G(Y)) is

    H'(G(Y)) = −λ Y (Y^T Y)^{−1/2}.      (19)

According to Algorithm 1, problem (16) can be transformed into problem (20), since h_i(g_i(Y)) and H(G(Y)) are both concave in their subfunctions:

    min_{Y1=1, Y≥0} Σ_{i=1}^{v} Tr( d_i g_i(Y) ) + Tr( −λ D^T G(Y) )
    ⇔ min_{Y1=1, Y≥0} Σ_{i=1}^{v} d_i ||B_i Z_i − Y||_F² − λ Tr( D^T Y )      (20)

where d_i = 1 / ( 2 ||B_i Z_i − Y||_F ) and D = Y (Y^T Y)^{−1/2}. Note that d_i is also an adaptive weight for each view, which can distinguish the reliability of different views.

To solve problem (20), we can rewrite it as follows:

    min_{Y1=1, Y≥0} Σ_{i=1}^{v} d_i ||B_i Z_i − Y||_F² − λ Tr( D^T Y )
    ⇔ min_{Y1=1, Y≥0} Σ_{i=1}^{v} d_i Tr( (B_i Z_i − Y)^T (B_i Z_i − Y) ) − λ Tr( Y^T D )
    ⇔ min_{Y1=1, Y≥0} Σ_{i=1}^{v} d_i Tr( Y^T Y − 2 Y^T B_i Z_i ) − λ Tr( Y^T D )
    ⇔ min_{Y1=1, Y≥0} Tr( (Σ_{i=1}^{v} d_i) Y^T Y − 2 Y^T Σ_{i=1}^{v} d_i B_i Z_i ) − λ Tr( Y^T D )
    ⇔ min_{Y1=1, Y≥0} || Y − ( Σ_{i=1}^{v} d_i B_i Z_i + (1/2) λ D ) / ( Σ_{i=1}^{v} d_i ) ||_F².      (21)

Let M = ( Σ_{i=1}^{v} d_i B_i Z_i + (1/2) λ D ) / ( Σ_{i=1}^{v} d_i ); then problem (21) can be rewritten as follows:

    min_Y ||Y − M||_F²   s.t. Y1 = 1, Y ≥ 0
    ⇔ min_{y_i} ||y_i − m_i||_2²   s.t. y_i^T 1 = 1, y_i ≥ 0      (22)

where y_i is the ith row of Y and m_i is the ith row of M. Problem (22) can be seen as a proximal problem, which will be discussed in detail later.

2) Update Each Z_i With the Fixed Y: When updating each Z_i with Y fixed, problem (14) can be transformed into the following:

    min_{Z_i 1 = 1, Z_i ≥ 0} ||B_i Z_i − Y||_F².      (23)

It is obvious that problem (23) is independent for each row of Z_i; thus, it can be transformed into

    min_{z_i^T 1 = 1, z_i ≥ 0} || b_i z_i^T + B_r Z_r − Y ||_F²
    ⇔ min_{z_i^T 1 = 1, z_i ≥ 0} Tr( z_i^T b_i^T b_i z_i + 2 z_i^T (B_r Z_r − Y)^T b_i )
    ⇔ min_{z_i^T 1 = 1, z_i ≥ 0} || z_i + (B_r Z_r − Y)^T b_i / (b_i^T b_i) ||_2²      (24)

where z_i denotes the ith row of Z_i and b_i denotes the ith column of B_i.

Problems (22) and (24) share the same formulation, which can be rewritten in a more compact form:

    min_x (1/2) ||x − v||_2²   s.t. x^T 1 = 1, x ≥ 0      (25)

where x and v denote arbitrary column vectors. This problem has been solved by Huang et al. [43], and we present the solution here. First, we write the Lagrangian function of problem (25) as

    L(x, α, β) = (1/2) ||x − v||_2² + α (x^T 1 − 1) − β^T x      (26)

where α and β are the Lagrangian multipliers, both of which are to be determined. Suppose the optimal solution to the minimization problem (25) is x*, with associated Lagrangian multipliers α* and β*. According to the KKT conditions, we have the following equations:

    ∀j,  x_j* − v_j + α* − β_j* = 0      (27)
    ∀j,  x_j* ≥ 0      (28)
    ∀j,  β_j* ≥ 0      (29)
    ∀j,  x_j* β_j* = 0      (30)

where x_j* is the jth element of vector x*, v_j is the jth element of vector v, and β_j* is the jth element of vector β*. Equation (27) can be written as x_j* = v_j + β_j* − α*. According to (30), we have β_j* = 0 or x_j* = 0. Therefore, x_j* = v_j − α* if β_j* = 0, and x_j* = 0 if β_j* = α* − v_j. Combining this with (28) and (29), we have

    x_j* = v_j − α*,  if v_j − α* ≥ 0;   x_j* = 0,  if v_j − α* < 0;
    i.e.,  x_j* = (v_j − α*)_+      (31)

where (x)_+ = max(x, 0). According to the constraint x^T 1 = 1, we have

    Σ_{j=1}^{c} (v_j − α*)_+ = 1.      (32)

Define the function f(α) = Σ_{j=1}^{c} (v_j − α)_+ − 1; then, we can obtain α* by solving this root-finding problem. Note that f(α) ≤ 0 when α ≥ α*, and f(α) is a piecewise linear and convex function, so we can use the Newton method to find the root
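This root-finding recipe can be implemented in a few lines. The following NumPy sketch (function name ours) solves problem (25) via Newton's method on f(α) and thus serves the row-wise subproblems (22) and (24):

```python
import numpy as np

def simplex_proj(v, tol=1e-10, max_iter=100):
    """Euclidean projection onto the probability simplex, i.e. the solution
    of problem (25): find the root of f(alpha) = sum_j (v_j - alpha)_+ - 1
    by Newton's method, then set x_j = (v_j - alpha*)_+ as in eq. (31)."""
    v = np.asarray(v, dtype=float)
    alpha = (v.sum() - 1.0) / v.size       # exact root when nothing is clipped
    for _ in range(max_iter):
        pos = v - alpha > 0                # active set {j : v_j > alpha}
        f = (v[pos] - alpha).sum() - 1.0   # f(alpha)
        if abs(f) < tol or not pos.any():
            break
        alpha -= f / (-pos.sum())          # Newton step: f'(alpha) = -|active set|
    return np.maximum(v - alpha, 0.0)      # eq. (31)
```

In the proposed algorithm, this projection is applied row-wise: to each row m_i of M in the Y-update (22), and to each target vector in the Z_i-update (24). Since f is piecewise linear, the Newton iteration terminates in finitely many steps.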
Fig. 2. Experimental results for synthetic datasets without noise views. (a) and (d) The two datasets used in the experiment; different colors indicate the different classes of the samples. (b) and (e) The anchor graph of each view, constructed in advance, along with the adaptive weight calculated by the algorithm. (c) and (f) The clustering results obtained by the proposed algorithm; the colors indicate which cluster each data point is assigned to by the proposed algorithm. (a) Data. (b) Graph. (c) Result. (d) Data. (e) Graph. (f) Result.
Fig. 3. Experimental results for synthetic datasets with a noise view. (a) and (d) The two datasets used in the experiment; different colors indicate the different classes of the samples. (b) and (e) The anchor graph of each view, constructed in advance, along with the adaptive weight calculated by the algorithm. (c) and (f) The clustering results obtained by the proposed algorithm; the colors indicate which cluster each data point is assigned to by the proposed algorithm. (a) Data. (b) Graph. (c) Result. (d) Data. (e) Graph. (f) Result.
in Table II. We will provide a simple method to determine the value of λ by observing the objective function, as discussed in the upcoming section. To randomize the experiments, each method is run 20 times, and the average results are recorded in the tables.

2) Datasets: The proposed algorithm and the compared approaches are tested on seven real-world datasets, which include outdoor scene images (MSRC [44] and Scene [45]), face images (ORL [46]), an article collection (Wikipedia Articles [47]), object images (Caltech101-20 [48]), and handwritten digits (UCI Digits [49] and MNIST [50]), as described in Table I.

3) Baselines: To evaluate the effectiveness and efficiency of the approach proposed in this article, we compare it with several representative approaches, including four graph-based multiview clustering approaches: auto-weighted multiple graph learning (AMGL) [26], self-weighted multiview clustering (SwMC) [11], Co-Reg [2], and graph-based multiview clustering (GMC) [17]; and two multiview subspace clustering algorithms: fast parameter-free multiview subspace clustering
TABLE I
BRIEF INTRODUCTION TO REAL DATASETS
TABLE III
CLUSTERING ACC ON THE REAL-WORLD DATASETS [MEAN ± STD(%)]
TABLE IV
CLUSTERING NMI ON THE REAL-WORLD DATASETS [MEAN ± STD(%)]
TABLE V
t-TEST FOR CLUSTERING ACC ON THE REAL-WORLD DATASETS
TABLE VI
t-TEST FOR CLUSTERING NMI ON THE REAL-WORLD DATASETS
TABLE VIII
RUNNING TIME OF THE COMPARED ALGORITHMS ON THE REAL-WORLD DATASETS (SECONDS)
Fig. 4. Clustering accuracy of the proposed MVFCAG under different parameter combinations. (a) MSRC. (b) ORL. (c) Scene. (d) Caltech101-20. (e) UCI Digits. (f) MNIST.
Fig. 5. Relationship between D, R, clustering accuracy, and the value of λ. The blue bars record the clustering accuracy, the red line records the values of D
(represented as “Difference” in the figure), and the blue line records the value of R (represented as “Regularization” in the figure). (a) MSRC. (b) ORL. (c) Scene.
(d) Caltech101-20. (e) UCI Digits. (f) MNIST.
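The two diagnostics tracked in Fig. 5, D ("Difference") and R ("Regularization"), can be computed directly from the learned variables. A minimal sketch, assuming the view graphs B_i, anchor memberships Z_i, and fused membership Y are available as NumPy arrays, and taking the norm in D as the Frobenius norm used in the objective (the function name is ours):

```python
import numpy as np

def lambda_diagnostics(Bs, Zs, Y):
    """Quantities plotted in Fig. 5 for choosing lambda:
    D = sum_i ||B_i Z_i - Y||_F  (disagreement of each view with the fused result),
    R = -||Y||_*                 (clarity of the clustering partition)."""
    D = sum(np.linalg.norm(B @ Z - Y) for B, Z in zip(Bs, Zs))
    R = -np.linalg.norm(Y, ord='nuc')
    return D, R
```

For a perfectly agreeing, discrete single-view toy case (B the identity and Z equal to a hard indicator Y), D is zero and R equals minus the nuclear norm of the indicator matrix.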
percentage of anchors from the set {0.05, 0.1, 0.15, 0.2, 0.25} and the regularization parameter λ from the set {0.25, 0.5, 1, 1.25, 1.5, 1.75, 2, 2.25, 2.5}. We then record the clustering accuracy on six real-world datasets under the different parameter combinations in Fig. 4. As can be seen from the figure, the clustering accuracy improves when the number of anchors increases. However, the clustering performance does not continually improve as more anchors are added. In order to make the algorithm more efficient, an anchor rate of 15% is enough to obtain an effective clustering result.

Nevertheless, the value of the regularization parameter λ has a great influence on the accuracy of the clustering results. To further explore the effects of λ and to design an approach for choosing a suitable value, we record two additional quantities, D = Σ_{i=1}^{v} ||B_i Z_i − Y|| and R = −||Y||_*, where D represents how different the clusters of each view are from the final results and R represents how clear the clustering is. Fig. 5 shows how these two values and the clustering accuracy change with the value of λ. It can be seen that as λ increases, the value of R continually declines, while the value of D increases and finally reaches its maximum and stays steady. According to the theoretical analysis, both values need to be minimized for better clustering results. Therefore, when the value of R is not small enough or D reaches its maximum, the clustering accuracy decreases significantly. Thus, we can simply choose a suitable value for λ by continuously increasing it until D reaches its maximum.

V. CONCLUSION

In this article, we presented a novel graph-based multiview fuzzy clustering algorithm that directly obtains the membership matrix without any postprocessing step. An adaptive weight is introduced for each view during optimization to deal with unreliable views. The computational complexity of the algorithm is linear in the number of samples, so it is suitable for large-scale datasets. The advantages of the proposed approach are verified on both synthetic and real-world datasets. The experimental results demonstrate that our algorithm obtains a more reliable membership matrix than the previous multiview fuzzy clustering algorithm, and it significantly improves efficiency compared with previous graph-based algorithms. However, the algorithm is somewhat sensitive to the regularization parameter λ. Although we have designed an approach to choose a suitable value for λ, parameter tuning is still a time-consuming process; how to adjust the parameter during optimization will be explored in the future. Additionally, the computational complexity of our algorithm is quadratic with respect to the number of anchor points; consequently, we observed during the experiments that our method consumes a long time when dealing with a large number of anchor points. This is also a topic that merits further study. Another concern pertains to the requirement of prior knowledge of the number of classes, which may not always be available in practical applications. To address this issue, we will investigate strategies that enable the algorithm to adaptively adjust the number of clusters.

REFERENCES

[1] G. Li, D. Song, W. Bai, K. Han, and R. Tharmarasa, "Consensus and complementary regularized non-negative matrix factorization for multi-view image clustering," Inf. Sci., vol. 623, pp. 524–538, 2023.
[2] A. Kumar, P. Rai, and H. Daume, "Co-regularized multi-view spectral clustering," in Proc. Adv. Neural Inf. Process. Syst., 2011, pp. 1413–1421.
[3] A. Kumar and H. Daumé, "A co-training approach for multi-view spectral clustering," in Proc. Int. Conf. Mach. Learn., 2011, pp. 393–400.
[4] J. Liu, C. Wang, J. Gao, and J. Han, "Multi-view clustering via joint nonnegative matrix factorization," in Proc. SIAM Int. Conf. Data Mining, 2013, pp. 252–260.
[5] J. Sun, J. Lu, T. Xu, and J. Bi, "Multi-view sparse co-clustering via proximal alternating linearized minimization," in Proc. Int. Conf. Mach. Learn., 2015, pp. 757–766.
[6] G. Chao et al., "Multi-view cluster analysis with incomplete data to understand treatment effects," Inf. Sci., vol. 494, pp. 278–293, 2019.
[7] Z. Li, F. Nie, X. Chang, Y. Yang, C. Zhang, and N. Sebe, "Dynamic affinity graph construction for spectral clustering using multiple features," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 12, pp. 6323–6332, Dec. 2018.
[8] R. Zhou, X. Chang, L. Shi, Y.-D. Shen, Y. Yang, and F. Nie, "Person reidentification via multi-feature fusion with adaptive graph learning," IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 5, pp. 1592–1601, May 2019.
[9] F. Nie, X. Wang, M. I. Jordan, and H. Huang, "The constrained Laplacian rank algorithm for graph-based clustering," in Proc. AAAI Conf. Artif. Intell., 2016, pp. 1969–1976.
[10] Z. Li, F. Nie, X. Chang, L. Nie, H. Zhang, and Y. Yang, "Rank-constrained spectral clustering with flexible embedding," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 12, pp. 6073–6082, Dec. 2018.
[11] F. Nie et al., "Self-weighted multiview clustering with multiple graphs," in Proc. Int. Joint Conf. Artif. Intell., 2017, pp. 2564–2570.
[12] X. Li, H. Zhang, R. Wang, and F. Nie, "Multiview clustering: A scalable and parameter-free bipartite graph fusion method," IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 1, pp. 330–344, Jan. 2020.
[13] M. Brbić and I. Kopriva, "Multi-view low-rank sparse subspace clustering," Pattern Recognit., vol. 73, pp. 247–258, 2018.
[14] Z. Kang, W. Zhou, Z. Zhao, J. Shao, M. Han, and Z. Xu, "Large-scale multi-view subspace clustering in linear time," in Proc. AAAI Conf. Artif. Intell., vol. 34, no. 4, 2020, pp. 4412–4419.
[15] S. Wang et al., "Fast parameter-free multi-view subspace clustering with consensus anchor guidance," IEEE Trans. Image Process., vol. 31, pp. 556–568, Dec. 2021.
[16] S. Huang, Z. Kang, I. W. Tsang, and Z. Xu, "Auto-weighted multi-view clustering via kernelized graph learning," Pattern Recognit., vol. 88, pp. 174–184, 2019.
[17] H. Wang, Y. Yang, and B. Liu, "GMC: Graph-based multi-view clustering," IEEE Trans. Knowl. Data Eng., vol. 32, no. 6, pp. 1116–1129, Jun. 2019.
[18] L. Li and H. He, "Bipartite graph based multi-view clustering," IEEE Trans. Knowl. Data Eng., vol. 34, no. 7, pp. 3111–3125, Jul. 2020.
[19] X. Wang, X. Guo, Z. Lei, C. Zhang, and S. Z. Li, "Exclusivity-consistency regularized multi-view subspace clustering," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 923–931.
[20] Y. Liang, D. Huang, C.-D. Wang, and S. Y. Philip, "Multi-view graph learning by joint modeling of consistency and inconsistency," IEEE Trans. Neural Netw. Learn. Syst., to be published, doi: 10.1109/TNNLS.2022.3192445.
[21] S. Huang, I. W. Tsang, Z. Xu, and J. Lv, "Measuring diversity in graph learning: A unified framework for structured multi-view clustering," IEEE Trans. Knowl. Data Eng., vol. 34, no. 12, pp. 5869–5883, Dec. 2021.
[22] S. Huang, I. W. Tsang, Z. Xu, and J. Lv, "CGDD: Multiview graph clustering via cross-graph diversity detection," IEEE Trans. Neural Netw. Learn. Syst., to be published, doi: 10.1109/TNNLS.2022.3201964.
[23] Z. Li, C. Tang, X. Liu, X. Zheng, W. Zhang, and E. Zhu, "Consensus graph learning for multi-view clustering," IEEE Trans. Multimedia, vol. 24, pp. 2461–2472, 2021.
[24] X. Shu, X. Zhang, Q. Gao, M. Yang, R. Wang, and X. Gao, "Self-weighted anchor graph learning for multi-view clustering," IEEE Trans. Multimedia, to be published, doi: 10.1109/TMM.2022.3193855.
[25] T. Xia, D. Tao, T. Mei, and Y. Zhang, "Multiview spectral embedding," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 40, no. 6, pp. 1438–1446, Jun. 2010.
[26] F. Nie et al., "Parameter-free auto-weighted multiple graph learning: A framework for multiview clustering and semi-supervised classification," in Proc. Int. Joint Conf. Artif. Intell., 2016, pp. 1881–1887.
[27] Y. Li, F. Nie, H. Huang, and J. Huang, "Large-scale multi-view spectral clustering via bipartite graph," in Proc. AAAI Conf. Artif. Intell., 2015, pp. 2750–2756.
[28] S. Shi, F. Nie, R. Wang, and X. Li, "Fast multi-view clustering via prototype graph," IEEE Trans. Knowl. Data Eng., vol. 35, no. 1, pp. 443–455, Jan. 2021.
[29] F. Nie, L. Tian, and X. Li, "Multiview clustering via adaptively weighted procrustes," in Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2018, pp. 2022–2030.
[30] S. Shi, F. Nie, R. Wang, and X. Li, "Auto-weighted multi-view clustering via spectral embedding," Neurocomputing, vol. 399, pp. 369–379, 2020.
[31] Q. Qiang, B. Zhang, F. Wang, and F. Nie, "Fast multi-view discrete clustering with anchor graphs," in Proc. AAAI Conf. Artif. Intell., vol. 35, no. 11, 2021, pp. 9360–9367.
[32] B. Yang, X. Zhang, Z. Lin, F. Nie, B. Chen, and F. Wang, "Efficient and robust multiview clustering with anchor graph regularization," IEEE Trans. Circuits Syst. Video Technol., vol. 32, no. 9, pp. 6200–6213, Sep. 2022.
[33] B. Yang, X. Zhang, B. Chen, F. Nie, Z. Lin, and Z. Nan, "Efficient correntropy-based multi-view clustering with anchor graph embedding," Neural Netw., vol. 146, pp. 290–302, 2022.
[34] S. Shi, F. Nie, R. Wang, and X. Li, "Multi-view clustering via nonnegative and orthogonal graph reconstruction," IEEE Trans. Neural Netw. Learn. Syst., vol. 34, no. 1, pp. 201–214, Jan. 2023.
[35] Z. Hu, F. Nie, R. Wang, and X. Li, "Multi-view spectral clustering via integrating nonnegative embedding and spectral embedding," Inf. Fusion, vol. 55, pp. 251–259, 2020.
[36] W. Liu, J. He, and S.-F. Chang, "Large graph construction for scalable semi-supervised learning," in Proc. Int. Conf. Mach. Learn., 2010, pp. 679–686.
[37] X. Hu et al., "Multi-view fuzzy classification with subspace clustering and information granules," IEEE Trans. Knowl. Data Eng., to be published, doi: 10.1109/TKDE.2022.3231929.
[38] F. He, F. Nie, R. Wang, H. Hu, W. Jia, and X. Li, "Fast semi-supervised learning with optimal bipartite graph," IEEE Trans. Knowl. Data Eng., vol. 33, no. 9, pp. 3245–3257, Sep. 2021.
[39] F. Nie, C. Liu, R. Wang, Z. Wang, and X. Li, "Fast fuzzy clustering based on anchor graph," IEEE Trans. Fuzzy Syst., vol. 30, no. 7, pp. 2375–2387, Jul. 2021.
[40] F. Nie, D. Wu, R. Wang, and X. Li, "Truncated robust principle component analysis with a general optimization framework," IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 2, pp. 1081–1097, Feb. 2020.
[41] J. M. Steele, The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[42] G. A. Watson, "Characterization of the subdifferential of some matrix norms," Linear Algebra Appl., vol. 170, no. 1, pp. 33–45, 1992.
[43] J. Huang, F. Nie, and H. Huang, "A new simplex sparse learning model to measure data similarity for clustering," in Proc. Int. Joint Conf. Artif. Intell., 2015, pp. 3569–3575.
[44] Y. J. Lee and K. Grauman, "Foreground focus: Unsupervised learning from partially matching images," Int. J. Comput. Vis., vol. 85, pp. 143–166,
[52] S. Wang et al., "Multi-view clustering via late fusion alignment maximization," in Proc. Int. Joint Conf. Artif. Intell., 2019, pp. 3778–3784.
[53] M.-S. Yang and K. P. Sinaga, "Collaborative feature-weighted multi-view fuzzy c-means clustering," Pattern Recognit., vol. 119, 2021, Art. no. 108064.

Weizhong Yu received the Ph.D. degree in electronic science from the Xi'an Research Institute of Hi-Tech, Xi'an, China, in 2012. Between 2005 and 2011, he also studied at the Institute of Microelectronics, Tsinghua University, Beijing, China, for his Ph.D. degree. He is currently an Associate Researcher with the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi'an. His main research interests include machine learning and computer vision.

Liyin Xing received the B.S. degree in automation from Northwestern Polytechnical University, Xi'an, China, in 2021. She is currently working toward the M.S. degree in cybersecurity with the School of Cybersecurity and the School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University. Her research interests include clustering and its applications.

Feiping Nie (Senior Member, IEEE) received the Ph.D. degree in computer science from Tsinghua University, Beijing, China, in 2009. He is currently a Full Professor with Northwestern Polytechnical University, Xi'an, China. He has authored and co-authored more than 100 papers in the following journals and conferences: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), International Journal of Computer Vision (IJCV), IEEE Transactions on Image Processing (TIP), IEEE Transactions on Neural Networks and Learning Systems (TNNLS), IEEE Transactions on Knowledge and Data Engineering (TKDE), International
2009. Conference on Machine Learning (ICML), Conference on Neural Information
[45] G.-S. Xie, X.-B. Jin, Z. Zhang, Z. Liu, X. Xue, and J. Pu, “Retargeted Processing Systems (NIPS), ACM SIGKDD Conference on Knowledge Dis-
multi-view feature learning with separate and shared subspace uncover- covery and Data Mining (KDD), International Joint Conference on Artificial
ing,” IEEE Access, vol. 5, pp. 24895–24907, 2017. Intelligence (IJCAI), Association for the Advancement of Artificial Intelligence
[46] F. S. Samaria and A. C. Harter, “Parameterisation of a stochastic model for (AAAI), International Conference on Computer Vision (ICCV), IEEE/CVF
human face identification,” in Proc. IEEE Workshop Appl. Comput. Vis., Conference on Computer Vision and Pattern Recognition (CVPR), ACM Mul-
1994, pp. 138–142. timedia Conference (ACM MM). His papers have been cited more than 20 000
[47] J. C. Pereira et al., “On the role of correlation and abstraction in cross- times and the H-index is 84. His research interests include machine learning
modal multimedia retrieval,” IEEE Trans. Pattern Anal. Mach. Intell., and its applications, such as pattern recognition, data mining, computer vision,
vol. 36, no. 3, pp. 521–535, Mar. 2013. image processing, and information retrieval. Dr. Nie is currently serving as an
[48] D. Dueck and B. J. Frey, “Non-metric affinity propagation for unsuper- Associate Editor or PC member for several prestigious journals and conferences
vised image categorization,” in Proc. IEEE Int. Conf. Comput. Vis., 2007, in the related fields.
pp. 1–8.
[49] D. Dua and C. Graff, “UCI machine learning repository,” 2017. [Online].
Avalibale: http://archive.ics.uci.edu/ml
[50] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning
applied to document recognition,” vol. 86, no. 11, 1998, pp. 2278–2324. Xuelong Li (Fellow, IEEE) is currently a Full Professor with the School
[51] M.-S. Chen, T. Liu, C.-D. Wang, D. Huang, and J.-H. Lai, “Adaptively-
of Artificial Intelligence, OPtics and ElectroNics, Northwestern Polytechnical
weighted integral space for fast multiview clustering,” in Proc. 30th ACM
University, Xi’an, China.
Int. Conf. Multimedia, 2022, pp. 3774–3782.