
Social Network Analysis and Mining (2024) 14:93

https://doi.org/10.1007/s13278-024-01246-5

REVIEW PAPER

A comprehensive survey on community detection methods and applications in complex information networks
Abdelhani Diboune1 · Hachem Slimani2 · Hassina Nacer1 · Kadda Beghdad Bey3

Received: 14 November 2023 / Revised: 14 February 2024 / Accepted: 19 March 2024


© The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2024

Abstract
This paper extensively reviews the literature on community detection in complex networks and proposes a general classification describing the main models used for this purpose. In addition, a statistical study of the distribution of the recent relevant literature has been carried out to picture the trends in the models used by the main works published in the context of community detection. This mainly helps in understanding which community model is suitable for each real-world network application. Furthermore, we establish a critical study of the state-of-the-art approaches according to the proposed classification. Moreover, we investigate the relevant applications of communities in networks and we establish a statistical study to illustrate the distribution of research works in the field of community detection. Finally, we discuss several open issues and future research directions of approaches and applications that would be worth investigating in the area of community detection.

Keywords Network community detection · Graph clustering · Machine learning techniques · Multi-objective optimization ·
Game theoretic approaches · Networks applications

* Abdelhani Diboune, adiboune@usthb.dz
Hachem Slimani, hachem.slimani@univ-bejaia.dz
Hassina Nacer, sino_nacer@yahoo.fr
Kadda Beghdad Bey, k.beghdadbey@gmail.com

1 MOVEP Laboratory, USTHB, BP 32, 16111 Bab Ezzouar, Algiers, Algeria
2 LIMED Laboratory, Faculty of Exact Sciences, University of Bejaia, 06000 Bejaia, Algeria
3 Informatics Systems Laboratory, Ecole Militaire Polytechnique, BP 17, 16046 Bordj El Bahri, Algiers, Algeria

1 Introduction

Complex information networks are dynamic topological structures which accurately model many complex systems involving one or more specific interactions or relations between their sub-components. The node set in a complex network represents the components of the system (profiles, web pages, web services, etc.), while the edge set (connections) represents relations or interactions such as friendship, trust, co-citations, etc. Even though the graph-based structures of many of these complex information networks are very complex and show no trivial topology, they generally share some common global statistical features (Boccaletti et al. 2006): the small world property, the scale-free degree distribution and a high clustering coefficient. The small world property means that the distance between any two nodes in the network increases logarithmically with the network size. The scale-free property means that the connectivity distribution in complex networks is characterized by a power law. Finally, complex networks have a high clustering coefficient, which is defined as the fraction of connected triples of nodes which also form cliques. According to their application fields, complex networks can be divided into four categories (Newman 2010): social networks, technological networks, biological networks and information networks (which include online social networks). This paper studies the case of complex information networks.

Recently, there has been renewed interest in complex information network analysis. This is primarily due to the quick development of networks and their broad applicability in many academic and industrial domains ranging from recommender systems, friend recommendation, Sybil
defense, to web research. On the other hand, in spite of the large and increasing amount of literature on network analysis, the scalability and time complexity of network analysis algorithms remain open issues. In this context, finding a mesoscopic scale structure of the interconnection network can be used as a solution for the computational complexity of algorithms in large networks (Fortunato 2010). Indeed, automatic discovery of communities could be an adequate solution for revealing functionally associated coalitions in the network and providing a way of reducing the time complexity of many applications by targeting only relevant clusters.

Many researchers have studied the community detection problem and proposed many approaches based on various methods such as spectral clustering (Nascimento and Carvalho 2011), game theory (Jonnalagadda and Kuppusamy 2016), evolutionary computation (Pizzuti 2018), etc. However, the notion of community remains ill-defined and, until now, there is no unique formal consensual definition of network communities. In fact, according to their application fields, researchers have different assumptions to formally define communities (Fortunato 2010; Jonnalagadda and Kuppusamy 2016). Furthermore, community detection is an NP-hard problem, which means that even though in many cases it could be solved through an exhaustive search, the required running time is prohibitively large (Hämäläinen 2006). Hence, there has been a growing interest in developing and analyzing less time-consuming algorithms in the literature.

This paper intends to survey the state of the art of community detection in complex networks. In comparison to the existing works, this paper aims to interpret the underlying communities in complex information networks according to their characteristics and application domains. In this setting, the main aim is to determine, for each situation, the suitable mathematical model to be used for describing communities. Another purpose is to develop a statistical study which depicts the distribution of research works in the area of community detection in real-world networks.

The remainder of this paper is organized as follows: the next section outlines the main related works and contributions. Section 3 presents the research methodology adopted during the elaboration of our survey. Section 4 describes our classification proposal of the existing community models in the literature. Sections 5–8 present, for every model, a review and a comparative study, according to well-chosen criteria, of the main community detection approaches used in literature. In Sect. 9, we describe the different network application domains that use underlying communities as an additional dimension to enhance their performance. Section 10 discusses open issues while presenting some interesting research directions which may be worth investigating as future work. Finally, Sect. 11 concludes the paper.

2 Related work and contributions

2.1 Existing surveys in the literature

Many relevant surveys have explored the field of community detection from various perspectives. In this setting, Fortunato (2010) considered community detection in undirected interconnection graphs from the standpoint of statistical physics. In addition, the author discussed some crucial issues such as the importance of clustering and how the different approaches should be tested. Furthermore, he described some applications of community detection to real networks. Additionally, Nascimento and Carvalho (2011) discussed various graph clustering approaches to detect communities which are based on graph cut, graph partitioning, and spectral clustering algorithms. Moreover, Plantié and Crampes (2013) proposed an original survey on community detection, covering both semantics and clustering results of the different approaches used. Their survey presents three main analytical approaches for detecting communities that include the main methods in literature. The first approach considers the social network as a graph and then analyzes its structure. The second approach associates the network with a hypergraph and inspects its structure. Finally, the third approach uses the properties of concept lattices in order to study the network structure. Furthermore, Jonnalagadda and Kuppusamy (2016) proposed a survey that describes a taxonomy of game models and their characteristics as well as their performance. They discussed the interesting applications of game theory for community detection in social networks and gave future research directions in this context. On the other hand, Pizzuti (2018) provided a complete overview of methods used for community detection based on evolutionary computation. More specifically, her survey focused on methods based on Genetic Algorithms and Evolutionary strategies in general, as well as other Nature-inspired approaches for finding communities. These approaches are applied in various types of complex interconnection networks, including undirected, directed, and ad hoc networks. Moreover, Jin et al. (2021) proposed a comprehensive survey of community detection approaches in the case of a stochastic block model of communities. In this scope, they distinguished two main classes of approaches: Probabilistic graphical model based approaches, which are subdivided into three subclasses depending on the type of probabilistic graphical models which are dealt with (directed, undirected cases), and Deep learning-based methods, which include many Deep learning strategies: Auto-encoder-based, Generative adversarial network based, Graph convolutional network based, and integrating Graph convolutional networks and Undirected graphical models. Besides, Su et al. (2022)
proposed a new taxonomy that covers different community detection methods based on deep learning. Their classification includes Convolutional networks, Graph attention networks, Generative adversarial networks, and Autoencoders. The survey then discusses some practical applications of community detection in many domains and suggests some implementation scenarios.

2.2 Contributions of the survey

Even though these existing surveys (Fortunato 2010; Nascimento and Carvalho 2011; Plantié and Crampes 2013; Jonnalagadda and Kuppusamy 2016; Fortunato and Hric 2016; Pizzuti 2018; Jin et al. 2021; Su et al. 2022) treated community detection in much detail, their studies are limited to only one or two models, as shown in Table 1. Thus, it would be interesting to study a broader taxonomy of community detection approaches that includes the main different models in literature, depicts research trends, establishes a more extensive comparative analysis, and shows their pros and cons. Furthermore, the applicability of communities in real applications has not been given sufficient attention, which is, in our opinion, one of the most important issues in the community detection field. Moreover, it would be worth investigating the relationships between these applications and the main models. To address these shortcomings, we produce in this paper an extensive research survey dedicated to community detection in complex networks, where we investigate essentially more theoretical models of communities and their applicability to real network applications. More specifically, the main contributions of our work are illustrated in Table 1 and summarized as follows:

• It presents a more general classification based on the main mathematical models used to describe communities in complex information networks. This classification is supported by a statistical study of the main distribution of WoS-indexed relevant literature. This leads us to carry out a critical study that gives us a broader picture and insights about models and approaches used in literature in the context of community detection.
• It establishes a critical comparative study of the state-of-the-art approaches according to the proposed classification. This allows us to extract and visualize the main characteristics as well as the pros and cons of the models on which our classification is based.
• It addresses the difficulty of interpreting communities in complex information networks based on their application areas. This helps us to determine the accurate community model to be used according to the network applications.
• It provides a statistical study that shows the distribution of the relevant research works in the field of community detection in real-world complex networks. This illustrates the trend of community detection utilization in real-world applications and, consequently, the possible gaps in this context.

Table 1 Comparison with some existing surveys on community detection

| Existing surveys | Models of works reviewed | Network applications | Statistical study |
|---|---|---|---|
| Fortunato (2010) | Graph model | Biological functional associations, Friend communities | No |
| Nascimento and Carvalho (2011) | Graph model | Not investigated | No |
| Plantié and Crampes (2013) | Graph model | Not investigated | No |
| Jonnalagadda and Kuppusamy (2016) | Game model | Not investigated | No |
| Fortunato and Hric (2016) | Graph model | Biological functional associations, Friend communities | No |
| Pizzuti (2018) | MOO model | Not investigated | No |
| Jin et al. (2021) | Graph model (Stochastic), ML model | Online social network analysis, Image understanding, Neuro-science | No |
| Su et al. (2022) | ML model | Collaborative filtering, Biochemistry, Sybil defense, Online social networks analysis, Community deception, Community search | No |
| Our survey | Graph model, ML model, MOO model, Game model | Collaborative filtering, Friend recommendation, Sybil defense, Web communities, Service composition | Yes |

3 Research methodology

In order to investigate the main mathematical models of communities used in literature as well as the common applications of communities in complex information networks, we relied on a Systematic Literature Review (SLR) (Okoli
and Schabram 2015). Thus, all substantial research works published about these two fields were collected from Google Scholar and many electronic scientific databases (ScienceDirect, SpringerLink, IEEE Xplore, etc.). For this purpose, general keywords were used to retrieve a prior set of research papers such as: Communities in complex networks, coalitions in complex networks, graph clustering in complex networks, etc.

After a careful study of the titles, abstracts, keywords, introductions and main contributions of the obtained set of research papers, all irrelevant works were excluded. As a result of this first step, we have noticed that the collected research works fall into two main headings:

• The first heading consists of research works which concentrate on the investigation of mathematical models of network communities and the development of accurate methods to detect these communities. Nevertheless, they do not deal with the real-world applications of communities. The selected articles from the previous step that fall in this first class were fully read, and a first general classification was established. As a result, we noticed four basic models being adopted in literature: Graph model, Machine learning model, Multi-objective optimization model, and Game model. Furthermore, we refined every class of community models by searching more specific keywords such as: Graph clustering for community detection, Graph partitioning, Supervised/Unsupervised learning for community detection, Multi-objective optimization for community detection, Cooperative game for community detection, etc. After that, a deeper analysis of the methodologies and contributions described in these papers allowed us to deduce the essential frameworks and the approaches used for every model. In this context, the different models and frameworks recorded are discussed in Sects. 4, 5, 6, 7 and 8. The final stage consists of the study of the current trend of community modelization used in high quality scientific literature. Thus, all works which were not journal articles published in WoS-indexed journals were excluded from the study. Moreover, works published before 2018 were not considered. The remaining research works (113 journal articles) were classified according to the models, frameworks and approaches used; the results are discussed in Sect. 4.2.
• The second heading consists of research works that use community structures in order to enhance existing network applications. Generally, these studies focus neither on proposing an approach to accurately detect communities, nor on proposing an optimized model to represent communities. Instead, they just use community structures as additional information to improve existing network applications. As in the first heading, the papers initially selected were carefully read in order to judge whether the community notion is a substantial part of the articles' contents, and to study the type of network applications used. The most obvious finding that emerged from this first investigation is that communities are used in five main applications: Collaborative filtering, Friend recommendation, Sybil defense, Web communities, and Web service composition communities. As relevant literature was scarcer in the context of community application, a broader time interval was needed to collect a larger population of relevant high quality research works. Indeed, to enrich the search, based on the preliminary results, more specialized keywords were used such as: Social community based collaborative filtering, Profile clustering and friend recommender, Graph cut and Sybil defense, web community detection, community based service composition, and so forth. This extended set of articles (94 papers) was used to elaborate a synthetic statistical study that reveals the tendency of published works. In this scope, the main relevant works published since 2010 have been surveyed and carefully analyzed. The results are discussed in Sect. 9.

Table 2 illustrates an overview of the distribution of different source categories used in our statistical studies of the two headings: community models (including the resources that were used in the classification study but not used in the statistical study) and community applications.

Table 2 Distribution of different resource categories used in the classification studies of our survey

| Resource categories | Community models | Community applications |
|---|---|---|
| Journal articles | 127 | 54 |
| Conference papers | 13 | 40 |
| Book chapters | 0 | 0 |
| Total | 140 | 94 |

Total = 234

4 Classification proposal and distribution of research works relative to community detection approaches in complex networks

In this section, we propose a broader classification of the state-of-the-art of community detection approaches in complex networks. Then, we establish and discuss a distribution
of recent WoS-indexed relevant literature according to this classification.

4.1 Classification proposal

To date, there is no generally agreed definition of community in networks, and many models were introduced in literature to describe some desirable mathematical characteristics in communities (Fortunato 2010). This section examines the main mathematical models of communities in complex networks which are used in the literature on the community detection issue. In this setting, we have limited our study to the formal theoretical approaches used to model communities in networks without considering their applicability in real information networks. Note that Sect. 9 of this work is dedicated to this concern, where the main applications of community detection are categorized and discussed. In addition, a statistical study and a classification of the relevant research works according to the applicability aspect is established.

In the context of community detection approaches, our classification proposal is organized into three main levels as shown in Fig. 1. More specifically, each level is described as follows:

1. The first level is dedicated to the types of models that picture the communities. In this context, we consider four main models: Graph model, Machine-learning model, Multi-objective optimization model, and Game theory model.
2. The second level refers, for each model, to the kind of frameworks followed for detecting these communities in complex networks. In this context, Graph model includes three main frameworks: Pattern detection, Spectral clustering, and Stochastic approach. Furthermore, Machine learning model considers: Unsupervised learning, Semi-supervised learning, and Supervised learning. On the other hand, Multi-Objective Optimization model takes into account: Hierarchical, and Meta-heuristic frameworks. Finally, Game theoretic model is divided into two main frameworks: Non-cooperative games and Cooperative games.
3. The third level lists the associated approaches used to solve the community detection issue for every framework.

4.2 Distribution of WoS-indexed relevant literature

Following our proposed classification, we have categorized more than one hundred relevant research works relative to community models and frameworks in complex information networks published in Web of Science journals during the last six years as shown in Table 3 and Fig. 2.

• What stands out in Figs. 2 and 3 is the predominance of the Machine learning model in the field of community detection in complex information networks. With a total percentage of about 51%, we remark that the ratio of works published in the last few years increased rapidly (from 42 to 59% in 6 years).
• We notice that there has been a sharp fall in the ratio of research works using the Multi-objective model (from 47 to 25%). This is mainly due to the migration toward deep learning solutions (an approach of the Machine learning model), which generally ensure a better accuracy than Multi-objective models, which are mainly based on Metaheuristic approaches (see Table 3).
• It is apparent from the chart of Fig. 3 that relatively few works using the Graph model have been published in the last six years (a little more than 12%). This is mainly due to the high time complexity of graph approaches. In addition, graph models have difficulty taking all node features into account for clustering purposes, although these features are very relevant information for community detection in real complex network applications.
• Works on the Game theory model are mostly preliminary attempts and need further investigation in order to be better adapted to real large-scale network applications.

Following the proposed classification given in Fig. 1, in the next four sections, we describe the different models constituting the first level of the classification. Then, for each model, we discuss the associated frameworks, the relevant approaches used in each framework, and we establish a comparative study of some relevant research works according to the following criteria:

✓ Detected community types: describes whether the detected communities are overlapping communities or Partitions (non-overlapping communities).
✓ Distribution: describes the computation type adopted by the algorithm, i.e. distributed (parallel computation on multiprocessing units), centralized (sequential computation on a mono-processing unit), or parallelizable (centralized sequential computations; however, the algorithm could be parallelized).
✓ Time complexity of the community detection algorithm used.
✓ Scalability: says whether the scalability issue is considered or not while developing the approach.
✓ Application of the approach in complex networks.
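
The first criterion above (detected community types) can be made concrete with a small check. The following is a minimal sketch, not taken from any of the surveyed works: it assumes that communities are given as a list of Python sets of node identifiers, and the function name `community_type` is hypothetical. It reports whether a detected structure is a partition (pairwise disjoint communities covering all nodes) or an overlapping cover.

```python
from itertools import combinations


def community_type(nodes, communities):
    """Classify a community structure as 'partition' or 'overlapping'.

    nodes: iterable of all node identifiers in the network.
    communities: list of sets of node identifiers (illustrative format).
    Raises ValueError if the communities do not cover exactly the node set.
    """
    nodes = set(nodes)
    covered = set().union(*communities) if communities else set()
    if covered != nodes:
        raise ValueError("communities must cover exactly the node set")
    # A partition requires every pair of communities to be disjoint.
    for a, b in combinations(communities, 2):
        if a & b:
            return "overlapping"
    return "partition"


# Node 4 belongs to both communities in the second call.
print(community_type(range(1, 7), [{1, 2, 3}, {4, 5, 6}]))
print(community_type(range(1, 7), [{1, 2, 3, 4}, {4, 5, 6}]))
```

The pairwise test is quadratic in the number of communities, which is acceptable for illustration but would need a per-node membership count on large networks.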
Fig. 1 The proposed classification of community detection approaches
Table 3 Distribution of WoS-indexed relevant literature related to community models and frameworks in complex information networks published from January 2018 to June 2023

| Models | Frameworks | 2018–2019 | 2020–2021 | 2022–June 2023 |
|---|---|---|---|---|
| Graph model | Pattern | – | Zhu et al. (2020), Ramesh and Srivatsun (2021) | Yang et al. (2022) |
| Graph model | Spectral | Gui et al. (2018), Shi et al. (2019) | Taştan et al. (2021) | Francisquini et al. (2022) |
| Graph model | Stochastic | Li et al. (2018) | Yi et al. (2021), Wu et al. (2021) | Li et al. (2022), Contisciani et al. (2022), Chen and Mo (2022), Okamoto and Qiu (2022) |
| Machine learning model | Unsupervised | Wu et al. (2018), Sattari and Zamanifar (2018b), Lei et al. (2019), Jin et al. (2019), Sattari and Zamanifar (2018a), Duan et al. (2019), Xie et al. (2019), Deng et al. (2019), Li (2019), Ding et al. (2018), Cao et al. (2018), He et al. (2019), Yan and Chang (2019) | Wang et al. (2020), Yan and Chang (2020), Zhang et al. (2020), Lu et al. (2020), Nath et al. (2021), Roghani et al. (2021), Bouyer and Roghani (2020), Xu et al. (2020), Li et al. (2021), Rostami and Oussalah (2022), Malhotra and Chug (2021), Wang et al. (2021), Huang et al. (2021), Zhang and Zhou (2020) | Roozbahani et al. (2023), Fang and Lin (2022), Luo and Xu (2022), Shang et al. (2023), Fang et al. (2022), Zhao et al. (2022), Liu et al. (2023), Traag and Šubelj (2023), Laassem et al. (2022), Gholami et al. (2022), Niu et al. (2023), Al-Andoli et al. (2022), Chen et al. (2022), Salha-Galvan et al. (2022), Zhou et al. (2023), Hosseini-Pozveh et al. (2022), Al-sharoa and Rahahleh (2023) |
| Machine learning model | Semi-supervised | Li et al. (2018), Nan et al. (2018), He et al. (2019) | Qin and Lei (2021), Lu et al. (2020), Feng et al. (2021), De Santo et al. (2021) | Fang and Lin (2022), He et al. (2022), Yuan et al. (2023) |
| Machine learning model | Supervised | – | Cai et al. (2020) | Ali et al. (2023), Costa and Ralha (2023), Cai et al. (2022) |
| Multi-objective optimization model | Hierarchical | Liu and Ma (2019), Li et al. (2019), Haq et al. (2019) | – | Mishra et al. (2021), Qie et al. (2022), Zhao et al. (2023) |
| Multi-objective optimization model | Meta-heuristic | Bello-Orgaz et al. (2018), Li et al. (2019), Rahimi et al. (2018), Pattanayak et al. (2019), Shahmoradi et al. (2019), Sun et al. (2018), Li and Liu (2018), Žalik and Žalik (2018), Cheng et al. (2018), Yuanyuan and Xiyu (2018), Zou et al. (2019), Ebrahimi et al. (2018), Zhu et al. (2018), Moradi and Parsa (2019), Chen and Bi (2019) | Malhotra (2021), Zhang et al. (2020), Zhang et al. (2020), Su et al. (2021), Ma et al. (2021), Pérez-Peló et al. (2021), Wan et al. (2020), Belli et al. (2020), Pourabbasi et al. (2021) | Reihanian et al. (2023), Shen et al. (2022), Shang et al. (2022), Yang et al. (2022), Koc (2022), Sun et al. (2023), Wang et al. (2022) |
| Game model | Non-cooperative | Moscato et al. (2019) | Wang et al. (2021) | – |
| Game model | Cooperative | – | Zhou et al. (2020), Ayachi et al. (2021a) | – |
Fig. 2 Evolution of distribution of WoS-indexed relevant literature related to community models in complex information networks published from January 2018 to June 2023

Fig. 3 Distribution of WoS-indexed relevant literature related to community models in complex information networks published from January 2018 to June 2023

5 Graph model

An interconnection network is represented as a weighted graph $G(V, E, W)$ or unweighted graph $G(V, E)$, where $V$ is the vertex set, $E \subseteq V^2$ is the edge set and $W : E \to \mathbb{R}$ is a weight function.

The adjacency matrix of the graph $G$, denoted by $A$, is an $(n \times n)$-matrix where $n = |V|$, defined as follows: the rows and the columns of $A$ are indexed by the vertices of $G$. If two vertices $i$ and $j$ are non-adjacent, then the $(i, j)$-component of $A$ is 0; on the other hand, the $(i, j)$-component is non-zero if $i$ and $j$ are adjacent. We denote the $(i, j)$-component of $A$ by $A_{ij}$. The type of graph $G$ determines the possible values of $A_{ij}$. Thus, if $G$ is an undirected graph, then $A_{ij} = A_{ji}, \forall i, j \in N = \{1, \ldots, n\}$. Otherwise, if $G$ is a directed graph, then $\exists i, j \in N = \{1, \ldots, n\}, A_{ij} \neq A_{ji}$. If $G$ is an unweighted graph (resp. a weighted graph), then $\forall i, j \in N, A_{ij} \in \{0, 1\}$ (resp. $\exists i, j \in N, A_{ij} > 1$). Finally, if $G$ is a signed graph, then $A_{ij} \in \{-w, 0, w\}, w > 0$.

In Graph model, community structure is defined as clustering graph vertices into $k$ subgraphs $G_1 = (V_1, E_1), \ldots, G_k = (V_k, E_k)$ s.t. $V = \bigcup_{i=1}^{k} V_i$. In the case $V_i \cap V_j = \emptyset, \forall i, j \in \{1, \ldots, k\}$ with $i \neq j$, we have a Graph partitioning; otherwise, we refer to it simply as Community detection.

In the literature, graph model community detection is generally used to represent communities in the context of recommender systems (Deng et al. 2016; Symeonidis and Mantas 2013) and network security applications such as Sybil detection (Zhang and Sanjeev 2014; Gao et al. 2018).

Community detection methods in graph-based community models may be classified on the basis of the framework used into three main classes: Graph pattern detection, Spectral clustering and Stochastic approaches. In the following subsections, we describe each of these frameworks as well as some relevant related research works.

5.1 Graph pattern detection

This framework consists in detecting highly cohesive subgraphs over the overall sparse interconnection graph. The graph structures that are commonly used to represent communities in literature are:

• k-clique Complete subgraph of size $k$. Palla et al. (2005) found that overlaps between communities are very common in complex networks. They defined community as
the union of all k-cliques that can be accessed from each other through a succession of adjacent k-cliques, where adjacent k-cliques are k-cliques which share $k - 1$ nodes.
• N-club Mokken (1979) proposed the concept of N-club to represent cluster concepts in a graph $G$. An N-club is defined as a maximal subgraph of $G$ of diameter $N$.
• k-plex Wang et al. (2017) proposed a modelization of communities through k-plex. The bounded diameter of a k-plex and its upper bound size are good properties for representing a cohesive community structure. The formal definition of such a structure is as follows: $G'(V', E')$ is a k-plex in $G(V, E)$ if $\forall v \in V' : \deg^{in}_{G'}(v) \geq |V'| - k$, where $\deg^{in}_{G'}(v)$ is the internal degree of $v$ in $G'$ and $|V'|$ is the cardinality of $V'$.
• LS-set The concept of LS-set was pioneered by AMI (1972). The idea is that an LS-set is the union of its subsets, and the result is better than the subsets because it has less interconnection with the rest of the network. More formally, $G'(V', E')$ is an LS-set of a graph $G(V, E)$ if $\forall u, v \in V'$ and $w \in V - V'$, $\lambda(u, v) > \lambda(u, w)$, where $\lambda$ denotes the edge connectivity between two nodes, which is the minimum number of edges that must be removed in order to disconnect them.
• k-truss A k-truss of a graph $G$ is the maximal subgraph such that each edge belongs to at least $k - 2$ triangles in this subgraph (Zheng et al. 2017). Zheng et al. (2017) proposed a novel community model, called Weighted k-truss community, which is based on the concept of k-truss. To better characterize community properties, this model considers the edge weights. Then, the authors proposed a BFS-based online search algorithm for detecting the top $r$ weighted k-truss communities in $O(m^{1.5})$ time, where $m$ is the number of edges in the network.

To detect such graph structures representing communities, there are two main approaches in literature: deterministic and heuristic based approaches.

5.1.1 Deterministic approaches

Deterministic algorithms generate solutions in an iterative incremental way, such that after a number of repetitions, the algorithms will converge to an optimal solution. Although,

On the other hand, Zhang et al. (2014) proposed an approach based on k-clique clustering to identify the internet opinion leader community. Firstly, the maximum size of the clique ($k$) is determined according to the degree of each node. Then, the k-clique which contains a given node is computed. After that, every edge adjacent to the node considered in the last step is deleted. This step is repeated until all the k-cliques are found.

5.1.2 Heuristic approaches

Since detecting graph structures such as N-cliques, N-club, etc. is an NP-hard problem, many heuristic techniques have been proposed. Heuristic algorithms are dependent on the problem they are trying to solve. Their solutions are based on exploitation and exploration concepts for finding optimal solutions. However, they are usually trapped in local optima.

Zhu et al. (2020) proposed a community detection algorithm called Modularity Optimization with k-Plexes (MOKP). First, k-plexes are computed using a BS algorithm; these k-plexes are considered as community seeds, then the remaining nodes are assigned by modularity optimization. To save computational time, the authors proposed an improved version of MOKP which is based on reducing the scale of the network by removing all nodes whose degrees are less than $(z - k)$, where $z$ is the lower bound of the community size, and adjusting the node assignment rules.

In the same scope, Yang et al. (2022) developed an algorithm called Quadratic Optimization based Clique Expansion (QOCE) to detect overlapping communities in very large networks. This algorithm starts by detecting high-quality maximal cliques and considering them as the initial seeds. Then, the remaining nodes are assigned to the previously detected seeds by applying quadratic optimization.

Moreover, to overcome the drawbacks of the existing clique based schemes for community detection, Ramesh and Srivatsun (2021) proposed a new genetic algorithm based community detection approach. They relied on merged-maximal-clique based representation schemes in order to reduce the chromosome length with a smaller number of cliques. Furthermore, the operators of the Evolutionary algorithm are enhanced by renumbering the solutions to avoid redundant solutions during the execution of the algorithm.
deterministic algorithms give generally an optimal solution,
they are time-consuming in large datasets cases (Li et al.
2014; Palla et al. 2005). 5.2 Spectral clustering
Li et al. (2014) proposed an algorithm based on maximal
cliques to detect overlapping communities, and the bridge Let G = (V, E, W) be a weighted graph, W its adja-
vertices between them, in the unweighted and weighted cency matrix, the (i, j)-component of W represents the
networks. First, all the maximal cliques are depicted using weight of the edge between nodes i, j ∈ V , such that
Depth and Breadth searching. Then, two maximal cliques Wij ≥ 0, ∀ i, j ∈ {1, … , n}, where n = |V|. Consider the
may be merged into a larger cohesive subgraph according diagonal matrix D = [dij ]n×n , where dij = 0 if i ≠ j and
∑n
to some given rules. dii = j=1 wij is the degree of node vi.
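As a small numerical companion to these definitions, the following sketch builds the weight matrix $W$ and the node degrees for a hypothetical four-node graph (the weights are invented for illustration), forms the unnormalized Laplacian $L = D - W$ used throughout this section, and evaluates the quadratic form $x^t L x$ next to the pairwise sum $\frac{1}{2}\sum_{i,j} w_{ij}(x_i - x_j)^2$. The helper names `quadratic_form` and `pairwise_form` are assumptions made for this example and do not come from any of the cited works.

```python
# Hypothetical 4-node weighted undirected graph; the weights are illustrative only.
W = [
    [0.0, 2.0, 1.0, 0.0],
    [2.0, 0.0, 0.0, 3.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 3.0, 1.0, 0.0],
]
n = len(W)

# Degree of each node: d_ii = sum_j w_ij (the diagonal of D).
degrees = [sum(row) for row in W]

# Unnormalized graph Laplacian L = D - W.
L = [[(degrees[i] if i == j else 0.0) - W[i][j] for j in range(n)] for i in range(n)]


def quadratic_form(M, x):
    """Return x^T M x for a square matrix M given as a list of rows."""
    size = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(size) for j in range(size))


def pairwise_form(weights, x):
    """Return 1/2 * sum_{i,j} w_ij * (x_i - x_j)^2."""
    size = len(x)
    return 0.5 * sum(
        weights[i][j] * (x[i] - x[j]) ** 2 for i in range(size) for j in range(size)
    )


x = [1.0, -2.0, 0.5, 3.0]

# Applying L to the all-ones vector amounts to summing each row of L.
row_sums = [sum(row) for row in L]
```

For a symmetric $W$, `quadratic_form(L, x)` and `pairwise_form(W, x)` agree up to rounding, and every entry of `row_sums` is zero, which is the statement that the all-ones vector is an eigenvector of $L$ for the eigenvalue 0.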

The unnormalized graph Laplacian matrix of $G$, $L = [l_{ij}]_{n \times n}$, is given by $L = D - W$. In the case of an unweighted graph, the adjacency matrix $A$ is used instead of the weight matrix, i.e. $L = D - A$. The Laplacian matrix $L$ has the following properties (Von Luxburg 2007):

• $\forall x \in \mathbb{R}^n : x^t L x = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - x_j)^2$, where $x_i$ is the ith component of $x$.
• $L$ is a symmetric positive semi-definite matrix.
• The smallest eigenvalue of $L$ is $\lambda_1 = 0$, with the associated eigenvector $\mathbf{1} = (1, \dots, 1)^t$.

Spectral graph theory can be an effective tool for solving the community detection problem. Indeed, the relaxation of graph clustering allows the exploration of the eigenvalue and eigenvector properties of the Laplacian and adjacency matrices of graphs. Spectral clustering approaches can be classified into three main categories: Two-way partitioning, k-way partitioning and Relaxation.

5.2.1 Two-way partitioning

The work of Hagen and Kahng (1992) was one of the earliest proposals that defined the relation between the eigenvectors of the Laplacian matrix $L$ of a graph $G$ and the solution of a two-way ratio cut problem. Besides that, they found a lower bound of the problem, which is $\frac{\lambda_2}{n}$, where $\lambda_2$ is the second eigenvalue of the Laplacian matrix and $n$ is the number of nodes.

Moreover, Zhang and Sanjeev (2014) proposed a two-way spectral clustering approach to detect shilling attacks in recommender systems in the case where the attack profiles are highly correlated. For this purpose, they modeled the problem as finding a maximum submatrix in the user-user similarity matrix. More specifically, the user-user similarity matrix is translated into a graph. Then, an iterative two-way spectral clustering algorithm is applied to find the min-cut solution of the highly correlated group. To ensure the applicability of the approach in the case of unbalanced clustering, the graph is created based on the edge density.

5.2.2 K-way partitioning

Considering the graph $G(V, E, W)$, let $C = \{C_1, \dots, C_k\}$ be a set of $k$ subsets of $V$. We can define the set of cut edges as $Cut = \{(u, v) \in E \mid u \in C_i, v \in C_j, 1 \le i < j \le k\}$. Moreover, the cut size of $C$ is defined as $\Theta(C) = \sum_{(u,v) \in Cut} w(u, v)$, where $w(u, v)$ is the weight of the edge $(u, v)$. The problem of k-way partition can be formulated as follows: $\min \Theta(C)$ subject to $C_i \cap C_j = \emptyset$, $\forall i, j \in \{1, \dots, k\}$, $i \ne j$.

In this context, the algorithm proposed by Alpert et al. (1999) is considered one of the first major research works that addressed the problem of K-way partitioning. The idea is to find a good vector partitioning solution $P^k = \{P_1, \dots, P_k\}$ of a set of vectors $Y$, such that every cluster $P_i$, $i \in \{1, \dots, k\}$, consists of a subset of vectors that sum to a vector of large magnitude. Thus, the function to be optimized is given by the formula $f(P^k) = \sum_{j=1}^{k} \|Y_j\|^2$, such that $Y_j = \sum_{y \in P_j} y$. By considering $U = [u_{ij}]_{n \times d}$, the matrix whose columns are the first $d$ eigenvectors of the Laplacian matrix $L$ of the graph, the authors defined a matrix of scaled eigenvectors $V_d = [v_{ij}]_{n \times d}$, such that $v_{ij} = u_{ij} \sqrt{H - \lambda_j}$, where $\lambda_j$ is the jth eigenvalue of the Laplacian matrix of the graph and $H \ge \lambda_n$, with $\lambda_n$ the nth eigenvalue. First, $S$ is initialized to the empty set, $S = \emptyset$; then, iteratively, the vector $y_i^d \in Y$ that maximizes $\left\| \sum_{y \in S \cup \{y_i^d\}} y \right\|$ is added to $S$, removed from $Y$ and labeled with its vertex, in order. Thus, at any stage of the algorithm, $S$ consists of a good vector partitioning subset. The time complexity of the algorithm is estimated as $O(dn^2)$.

In the same vein, Symeonidis and Mantas (2013) proposed a multiway spectral clustering approach to provide friend recommendation. Their approach uses the first few eigenvectors and eigenvalues of the normalized Laplacian matrix and computes a multi-way partition of the obtained data. Firstly, the normalized Laplacian matrix $L$ of the adjacency matrix $A$ is computed as follows: $L = D^{-\frac{1}{2}} (D - A) D^{-\frac{1}{2}}$. Then, the first $d$ eigenvectors of $L$, $u_1, \dots, u_d$, are computed and stored as columns of a matrix $U$. Finally, by considering the vectors $v_i \in \mathbb{R}^d$, with $i = 1, \dots, n$, as the $n$ rows of $U$, a k-means clustering algorithm is applied to the resulting data points $v_1, \dots, v_n$ in order to find $k$ clusters.

Moreover, Newman (2006) focused on demonstrating how the maximization of modularity can be written in terms of the eigenspectrum of a matrix known as the Modularity matrix. This matrix is defined as follows: $B_{ij} = A_{ij} - P_{ij}$, where $A$ is the adjacency matrix of the interconnection graph and $P_{ij}$ is the probability that nodes $i$ and $j$ will be connected by an edge.

5.2.3 Relaxation

Jiang and McQuay (2012) showed that the maximization of quantitative functions, including Modularity and Modularity density, for community detection can be formulated as a combinatorial optimization problem based on the modularity Laplacian matrix. Furthermore, instead of applying traditional spectral relaxation, they applied an additional non-negative constraint to the graph clustering problem and proposed

an efficient algorithm to optimize the new objective. With the explicit non-negative constraint, the proposed solution converges to the ideal community indicator matrix and can directly classify nodes into communities. Moreover, the near-orthogonal columns of the solution can be reformulated as the posterior probability of nodes belonging to communities. Consequently, the proposed method can be used to depict fuzzy or overlapping communities and thus helps the understanding of the intrinsic structure of networks.

5.3 Stochastic approaches

A community in an interconnection graph can be considered as a subnetwork in which nodes have a higher probability to form edges with nodes that belong to the same community than with the other nodes of the network. More formally, let $G'(V', E')$ be a subgraph of the graph $G(V, E)$, and let $P_{ij}$ be the probability to form an edge between nodes $v_i$ and $v_j$. In the literature, there are two main stochastic models to define communities (Fortunato 2010):

• Strong stochastic model: $G'(V', E')$ is a strong stochastic community if $\forall i \in V', \forall j \in V', \forall k \in V - V' : P_{ij} > P_{ik}$.
• Weak stochastic model: $G'(V', E')$ is a weak stochastic community in $G(V, E)$ if $\forall i \in V' : \frac{\sum_{j \in V'} P_{ij}}{|V'|} > \frac{\sum_{j \in V - V'} P_{ij}}{|V'|}$.

In the literature, stochastic approaches can be classified into two relevant classes: Random walk approaches and Statistical inference approaches.

5.3.1 Random walk

Considering the high density of intra-cluster edges, the Random walk model is based on the idea that vertices belonging to the same community are quickly reachable from each other, and that the random walker has a higher probability to stay in the same community.

So if we consider $G'(V', E')$ a subgraph of $G(V, E)$ and a random walk of length $l$ that starts at $i \in V'$ ($l \ge diameter(G)$), with $P_j(t)$ the probability that the random walker is in node $j$ at step $t$, then $G'$ is a community if:

$$\forall j \in V', \forall k \in V - V' : P_j(t) > P_k(t).$$

Yi et al. (2021) proposed a set of algorithms based on a density sensitive random walk to detect local communities and avoid both cluster boundary and unbalanced walk issues. First, the authors defined the concept of graph density information, which allows the algorithm to identify the nodes near the community boundary and the nodes with higher graph density. For this, they defined the term $S^{(t)}$ as

$$S^{(t)} = g(t) R^{(t)} + (1 - g(t)) f(D),$$

where

$$f(D) = [f(D_{ij})],$$

and

$$g(t) = \begin{cases} \frac{N - t}{N}, & \text{if } 0 < t < N, \\ 0, & \text{otherwise}, \end{cases} \qquad f(D_{ij}) = \begin{cases} \frac{1}{2} P_{ij}, & \text{if } 0 < D_{ij} < 2 \times d_{avg}, \\ 0, & \text{otherwise}. \end{cases}$$

Here $D \in \mathbb{R}^{|V| \times |V|}$ is the density matrix, $R^{(t)}$ is the rate matrix at iteration $t$, $g(t) R^{(t)}$ serves as a reinforcement term and $(1 - g(t)) f(D)$ as the density term, and $d_{avg}$ is the average node density. Finally, the iteration of the algorithm is given by:

$$r^{(t+1)} = \alpha \, Norm(P \circ [U + S^{(t)}]) * r^{(t)} + (1 - \alpha) r^{(0)},$$

where the operator $A \circ B$ is the Hadamard product and $U$ is the $|V| \times |V|$ all-one matrix.

In the context of the security field, Gao et al. (2018) proposed the SYBILFUSE algorithm, a defense-in-depth framework for detecting Sybil attacks. First, the algorithm trains a local classifier (e.g. Support vector machine, Logistic regression, etc.) to compute local trust community scores for nodes and edges. Then, it uses a weighted random walk and a loopy belief propagation mechanism to propagate the local scores through the network.

Okamoto and Qiu (2022) proposed a formulation to decompose a random walk into localized random walks to detect pervasive communities. They described pervasive communities as network communities with no determined boundaries. These are defined by a conditional probability $p(n|k)$ which represents the relative belongingness of node $n$ to community $k$. Furthermore, they described the problem of detecting hierarchical pervasive communities in the network.

The research works discussed in this subsection are just a few substantial contributions in the context of community detection based on the Random walk approach. As described in Fig. 1, the reader can find other relevant research works published in this context in the following references: (Li et al. 2018; Wu et al. 2021; Li et al. 2022).

5.3.2 Statistical inference

Contisciani et al. (2022) proposed a probabilistic generative model based on statistical inference for hypergraphs with overlapping communities. This model provides an efficient

approach to detect overlapping communities. Let $H(V, E)$ be a hypergraph with $V = \{n_1, \dots, n_N\}$ and $E = \{e_1, \dots, e_e\}$; $H(V, E)$ can be represented as an adjacency tensor $A$ such that the entries $A_{n_1, \dots, n_d}$ represent the weights of a d-dimensional interaction between the nodes $n_1, \dots, n_d$. Let $\theta = (u, w)$ be a set of latent variables, such that $u$ is the $n \times k$ dimensional community membership and $w$ is the affinity matrix. The likelihood of observing the hypergraph given $\theta$ can be modeled as the probability:

$$P(A \mid \theta) = \prod_{e \in \Omega} e^{-\lambda_e} \frac{\lambda_e^{A_e}}{A_e!}, \quad \text{s.t.} \quad \lambda_e = \sum_{k} w_{d_e k} \prod_{i \in e} u_{ik}.$$

To infer the latent variables $\theta = (u, w)$ given the hypergraph $A$, the approach considers both maximum likelihood estimation, assuming uniform priors on the parameters, and maximum a posteriori estimation, assuming non-uniform priors. Thus, the algorithmic update for the membership vector is given by:

$$u_{ik} = \frac{\sum_{e \in E} B_{ie} \, \rho_{ek}}{\sum_{e \in \Omega \mid i \in e} w_{d_e k} \prod_{j \in e \mid j \ne i} u_{jk}},$$

where $B_{ie}$ is the weight of the hyperedge $e$ to which node $i$ belongs, and $\rho$ is a variational distribution determined in the expectation step of the Expectation Maximization (EM) procedure. In the same scope, the algorithmic update for the affinity matrix is given by:

$$w_{dk} = \frac{\sum_{e \in E \mid d_e = d} A_e \, \rho_{ek}}{\sum_{e \in \Omega \mid d_e = d} \prod_{j \in e} u_{jk}}.$$

Furthermore, Chen and Mo (2022) proposed a stochastic block model to represent multilayer weighted networks, and developed a framework based on statistical inference for efficient community detection. Their approach used a variational expectation-maximization algorithm to estimate the parameters of interest, such as the connectivity strength matrix.

5.4 Concluding remarks and discussion

Table 4 summarizes a comparative study of relevant research works relative to community detection approaches based on the Graph model.

• Much of the current literature on community detection pays particular attention to developing algorithms that deal with undirected unweighted graphs. However, real interconnection networks can be directed, bipartite, dynamic, multi-layered, and have weighted connections or nodes with features. Some researchers have developed approaches for dealing with such networks; important examples are the works of Roozbahani et al. (2023) for the case of multi-layer networks, Chen and Mo (2022) for the case of weighted multi-layer networks, and Sun et al. (2023) for dynamic networks. Unfortunately, there is still little research directly investigating these issues. The main reason that research to date has tended to focus on undirected unweighted graphs rather than the other types of graphs is that many problems of community detection in the different types of graphs can be translated into problems of community detection in unweighted undirected graphs. Indeed, a simple common way to deal with a directed graph when searching for communities is to ignore directionality and assume edge symmetry (Malliaros and Vazirgiannis 2013). In the context of community detection, a bipartite graph can be transformed into a non-bipartite graph by multiplying the corresponding adjacency matrix by its transpose. The weight between two nodes in the resulting graph is positively correlated with their Jaccard similarity in the original bipartite graph. Moreover, multi-layer networks describe multi-stranded relationships between nodes, where each type of relationship (trust, citation, similarity, etc.) is described in one simple layer (Kim and Lee 2015). Thus, community detection for every type of relationship can be considered as community detection in a simple network. Finally, to depict communities in dynamic networks, the majority of approaches take snapshots of the network topology (Rossetti and Cazabet 2018); the usual community detection approaches are then applied to find communities in the current snapshot of the network (static state).
• Another important scenario that could occur in real-world complex networks is the presence of signed edges with positive and negative weights, indicating attractive and repulsive interactions, respectively, especially in the case of correlation data and trust/distrust networks (Hosseini-Pozveh et al. 2022). Ideal partitions in this case would have positively weighted intra-cluster edges and negatively weighted inter-cluster edges. Unfortunately, few studies have been published for dealing with such networks. Thus, more in-depth research on this topic needs to be undertaken. Furthermore, signed and directed graphs are dealt with using undirected graphs with computed weights (Hosseini-Pozveh et al. 2022).
• The Graph model for community detection is accurate for detecting both overlapping and non-overlapping communities with an acceptable time complexity (mainly polynomial or linear). Furthermore, since several works are parallelizable, it would be interesting to use massive and distributed computing to enhance the convergence speed of the algorithms.
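The bipartite-to-unipartite transformation mentioned in the first remark above can be illustrated with a short sketch. It assumes that the "adjacency matrix" of the bipartite graph is taken in its biadjacency form $B$ (rows for one node type, columns for the other), so that $P = B B^t$ holds the number of shared neighbors of each pair of row nodes; the toy matrix and the helper names `project` and `jaccard` are hypothetical and serve only to show how the projected weights relate to the Jaccard similarity.

```python
# Hypothetical bipartite graph: 3 "user" nodes (rows) and 4 "item" nodes (columns).
B = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]


def project(biadjacency):
    """Return P = B B^T; P[i][j] counts the column nodes shared by rows i and j."""
    size = len(biadjacency)
    return [
        [sum(a * b for a, b in zip(biadjacency[i], biadjacency[j])) for j in range(size)]
        for i in range(size)
    ]


def jaccard(biadjacency, i, j):
    """Jaccard similarity of the neighbourhoods of row nodes i and j."""
    shared = sum(a & b for a, b in zip(biadjacency[i], biadjacency[j]))
    union = sum(a | b for a, b in zip(biadjacency[i], biadjacency[j]))
    return shared / union if union else 0.0


P = project(B)
```

In this toy example the off-diagonal entries of `P` grow with the Jaccard similarity of the corresponding pairs: users 0 and 1 share two items and have the largest weight, while users 0 and 2 share none and remain disconnected. The diagonal of `P` holds the node degrees and is usually discarded before community detection.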
Table 4 Comparative table of characteristics of complex network communities based on Graph model

The columns Model, Framework and Approach give the classification; the columns after Works are the comparison criteria.

| Model | Framework | Approach | Works | Community type | Links | Distribution | Complexity | Scalability | Application |
|---|---|---|---|---|---|---|---|---|---|
| Graph | Pattern detection | Deterministic | Zhang et al. (2014) | Overlapping | Undirected | Parallelizable | O(𝛼 𝛽 ln(n)), 𝛼, 𝛽: fitting parameters | Considered | Opinions' analysis |
| | | Heuristic | Zhu et al. (2020) | Partitioning | Undirected | Centralized | Linear | Considered | Community detection in complex network |
| | | | Deng et al. (2016) | Overlapping | Directed | Centralized | – | Not considered | Recommender system |
| | | | Yang et al. (2022) | Overlapping | Undirected | Distributed | $O\left(d \times n \times 3^{d/3} + \lvert S \rvert + \frac{1}{n} \sum_{s \in S} n_s^3\right)$, $n_s$: number of nodes in the sampled subgraph $s \in S$ | Considered | Overlapping community detection in complex network |
| | Spectral clustering | Two-way partitioning | Zhang and Sanjeev (2014) | Partitioning | Undirected | Centralized | $O(n^3)$ | Not considered | Sybil/shilling defense |
| | | K-way partitions | Francisquini et al. (2022) | Partitioning | Undirected | Parallelizable | $O(n^2)$ | Considered | Anomaly detection |
| | | | Symeonidis and Mantas (2013) | Partitioning | Undirected | Centralized | $O(n \times r) + O(k \times M(n))$, r: number of matrices, vector operations, M(n): cost of a matrix vector product | Considered | Friend recommender |
| | Stochastic | Random walk | Gao et al. (2018) | Partitioning | Undirected | Centralized | $O(m \times r)$, r: number of iterations | Considered | Sybil defense |
| | | | Yi et al. (2021) | Partitioning | Undirected | Parallelizable | $O(m)$ | Considered | Community detection in complex networks |
| | | | Okamoto and Qiu (2022) | Overlapping | Undirected | Centralized | – | Not considered | Community detection in complex networks |
| | | | Li et al. (2022) | Overlapping | Undirected | Parallelizable | $O\left(m + \frac{\log(m)}{\epsilon \alpha}\right)$ | Considered | Community detection in complex networks |
| | | Statistical inference | Contisciani et al. (2022) | Overlapping | Undirected | Parallelizable | $O(n \times h \times k)$, h: maximum hyperedge size in the dataset | Considered | Community detection in complex networks |
| | | | Chen and Mo (2022) | Partitioning | Undirected (weighted) | Parallelizable | $O(n^2 \times k^2 \times l)$, l: number of layers | Considered | Community detection in multilayer weighted complex network |

n: number of nodes in the network. D: average degree of nodes. K: number of clusters. m: number of edges. dim: dimension
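As a concrete companion to the pattern-detection models of Sect. 5.1 compared in Table 4, the following sketch tests the k-plex condition $\deg^{in}_{G'}(v) \ge |V'| - k$ on a candidate node set. It is only a membership check under stated assumptions (an undirected graph stored as a dictionary of neighbor sets, and a hypothetical helper named `is_k_plex`); it does not enumerate maximal k-plexes and is not the procedure of any cited work.

```python
def is_k_plex(adj, nodes, k):
    """Check deg_{G'}(v) >= |V'| - k for every node v of the candidate set V'.

    adj   -- dict mapping each node to the set of its neighbours (undirected graph)
    nodes -- iterable of nodes forming the candidate subgraph G'
    k     -- tolerance parameter of the k-plex (k = 1 corresponds to a clique)
    """
    members = set(nodes)
    # The internal degree of v counts only its neighbours that lie inside V'.
    return all(len(adj[v] & members) >= len(members) - k for v in members)


# Hypothetical 5-node graph: nodes 0-3 form a clique minus the edge (0, 3),
# and node 4 is attached to node 3 only.
adj = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {1, 2, 4},
    4: {3},
}
```

On this toy graph, `is_k_plex(adj, {0, 1, 2, 3}, 2)` holds because every node misses at most one neighbor inside the set, whereas the same set fails for `k = 1` since the edge between nodes 0 and 3 is absent.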

6 Machine learning model

Machine learning intends to learn optimal solutions from previous experiences E, with respect to the community detection task and fitness functions (performance measures), in order to improve the performance P on the future experiences of classifying new nodes into communities (Mitchell 1997).

As machine-learning algorithms can be very accurate approaches to classify or cluster nodes in complex networks based on their features in content-based network analysis, they have several practical real-world applications such as service composition (Gupta et al. 2018), social communities detection (Niu et al. 2023; Zhang and Zhou 2020), and trust circles detection (Jia et al. 2017).

In this scope, we distinguish three main frameworks used for community detection: Unsupervised learning, Semi-supervised learning and Supervised learning. In the upcoming subsections we describe these three frameworks and illustrate each with some relevant research works.

6.1 Unsupervised learning

Unsupervised learning approaches aim to cluster unlabeled nodes with no preexisting labels (communities). Unsupervised learning based approaches can be divided into five main relevant classes: Clustering, Autoencoder, Label propagation, Non-negative matrix factorization, and Deep non-negative matrix factorization.

6.1.1 Clustering

Let $V = \{p_1, p_2, \dots, p_n\}$ be the set of nodes of a complex network where each node $p_i$, $i \in \{1, \dots, n\}$, is considered as a vector of $d$ features: $p_i = [p_{i1}, p_{i2}, \dots, p_{id}]^t$. The aim of unsupervised clustering approaches is to partition the set of nodes $V$ into clusters (communities) $C = \{C_1, \dots, C_k\}$ which optimize specific objective functions that measure the clustering quality.

In this setting, Lu et al. (2020) proposed an approach for community detection based on a multiple-kernel combination fuzzy clustering constituted of four main phases. First, the random walk probability transition matrix $P^k = [x_1, x_2, \dots, x_n]$ is computed for different steps $k$. Then, multiple base kernels are computed. After that, a new kernel matrix $\theta$ is calculated by a linear combination; the objective function to optimize is given by the formula:

$$J_\lambda = \sum_{i=1}^{N} \sum_{j=1}^{k} u_{ij}^{m} (\theta(x_i) - v_j)^T (\theta(x_i) - v_j),$$

where $u_{ij}$ is the degree of membership of data point $p_i$ to cluster $C_j \in C$, $\sum_{l=1}^{|C|} u_{il} = 1$, and $m$ is the hyperparameter that controls the membership degree of nodes, $m > 0$. Finally, the fuzzy clustering algorithm iteratively updates the objective function and the membership matrix $u$ according to the following formula, until convergence:

$$u_{ij} = \frac{1}{\sum_{l=1}^{|C|} \left( \frac{d(x_i, v_j)}{d(x_i, v_l)} \right)^{\frac{1}{m-1}}},$$

where $d(x_i, v_j)$ denotes the distance between data point $x_i$ and cluster centre $v_j$.

Niu et al. (2023) proposed an Overlapping community detection algorithm based on adaptive Density Peaks clustering with Iterative partition strategy (ODPI). The algorithm uses a customized distance function to measure the distance between each pair of nodes. Thus, the distance matrix can be computed according to the following equation:

$$dist(i, j) = \frac{1}{ls(i, j) + \epsilon},$$

such that $dist(i, j)$ is the distance between nodes $v_i$ and $v_j$ and $ls(i, j)$ is the linking strength, which is given by the formula:

$$ls(i, j) = \frac{(lc(i, j) + A_{ij} + B_{ij}) \times (|V_{ij}| + 1)}{\min(I_i, I_j)},$$

where

$$lc(i, j) = \sum_{p \in V_{ij}} w_{ipj} \times \exp\left( -\left( \frac{w_{ipj} - maxw}{r \times t + \eta} \right) \right),$$

such that $V_{ij}$ is the set of common neighbors of nodes $v_i$ and $v_j$, $w_{ipj} = \min(A_{ip}, A_{pj})$, $maxw$ is the maximum value of weight, $r = maxw - \min(A)$ is the range of weights in $A$, $t \in [0, 1]$ controls the influence of common neighbors on the linking strength, $\eta$ is a small positive number that prevents the denominator from being equal to zero, $I_i$ and $I_j$ are the sums of the weights of the out-edges and in-edges of $v_i$ and $v_j$, respectively, and $B_{ij}$ is the Jaccard similarity coefficient between $v_i$ and $v_j$ (Al-Oufi et al. 2012). The ODPI algorithm uses an adaptive local density calculation method which is given by the following formula:

$$d_c = \sqrt{\frac{1}{N - 1} \sum_{i=1}^{N} (\theta_i^k - \mu^k)^2},$$

where $\theta_i^k$ is the distance between node $v_i$ and its kth nearest neighbor, $\mu^k$ is the mean value of $\theta_i^k$ over all nodes, and $k = \max\left(\left\lfloor \frac{1}{|V|} \sum_{v_i \in V} degree(v_i) \right\rfloor, 1\right)$. In addition to that, a cluster center assignment method based on the iterative k-means clustering algorithm is used. Finally, a strategy for assigning overlapping communities to nodes was proposed. Indeed, for every node whose neighbors belong to different communities, the degree of belongingness to every neighboring community is evaluated, given the sequence of community labels of all nodes in the network: $\{cl_1, \dots, cl_{|V|}\}$. The Affiliation Degree for community $c \in C$ of node $i$ is given by the following equation:


$$AD_i^c = \sum_{j \in KNN_i,\, cl_j = c} ls(i, j) \, \frac{\sum_{p \in KNN_j,\, cl_p = c} cls(i, p)}{\sum_{p \in KNN_j} cls(j, p)},$$

where $KNN_i$ is the set of K nearest neighbors of node $v_i$ and $c \in \bigcup_{v_j \in KNN_i} \{cl_j\}$. If $AD_i^c > \sigma AD_i^{cl_i}$, then node $v_i$ will be assigned to community $c$.

6.1.2 Autoencoder

To reduce the computational complexity, it is often useful to convert high-dimensional data to low-dimensional codes using a multi-layer neural network. The principle consists in training a neural network with a reduced central layer whose corresponding data are more suitable for clustering; the network should then be capable of reconstructing the initial high-dimensional input vectors. This class of neural networks is known as Autoencoders. Indeed, Autoencoders (AEs) can depict underlying nonlinear correlations in real networks and detect communities from reconstruction. A typical AE (Hinton and Salakhutdinov 2006) is composed of an encoder $H = \phi_e(A, X)$, a decoder $\hat{A} = \phi_r(H)$ or $\hat{X} = \phi_r(H)$, and a bottleneck, which is the Code hidden layer where the encoding is produced. The encoder $\phi_e$ maps the high-dimensional network, represented by its adjacency matrix $A$ and its possible attributes $X$, into a lower-dimensional encoding $H = [h_{ij}]_{d \times n}$, where $n$ is the number of nodes in the network and $d \ll n$. The ith column of $H$, i.e. $H_i$, represents the features of the node $v_i$ in the latent space: $H_i = \sigma(W_H X_i + d_H)$, where $W_H \in \mathbb{R}^{d \times n}$ and $d_H \in \mathbb{R}^d$ are learned encoder parameters and $\sigma(\cdot)$ is an element-wise non-linear activation function. The decoder $\phi_r$ takes the encoding and recreates the encoder input, i.e. the network structure $\hat{A}$ or its attributes $\hat{X}$. The objective is to maximize the equivalence between the input data $x$ and its reconstructed output $\phi_r(\phi_e(x))$. This is ensured by minimizing the reconstruction loss function, $\min L(x, \phi_r(\phi_e(x)))$, which could be the Mean Square Error or the Binary Cross Entropy. The main idea consists in applying classical clustering algorithms such as k-means, Non-negative matrix factorization, etc. on the lower-dimensional matrix $H = \phi_e(A, X)$, which has a more obvious community structure.

Salha-Galvan et al. (2022) proposed a community detection method based on enhanced versions of the Graph Autoencoder (GAE) and the Variational Graph Autoencoder (VGAE). First, they studied the reasons why GAE and VGAE are generally outperformed by other frameworks on community detection, especially when dealing with networks whose nodes do not have features. Then, they proposed a community-preserving message passing scheme to enhance their GAE and VGAE encoders; this is ensured by considering both the initial graph structure and modularity-based prior communities when computing embedding spaces. The steps of their proposed method are as follows. Firstly, the community membership matrix $A_c$ of the network is computed through the Louvain algorithm. Then, the matrix $A_c$ is s-regular sparsified in order to obtain a new community membership matrix $A_s$ in which each node in a given community $C_l$ is connected to only $s < n_l$ randomly selected nodes in $C_l$, where $n_l$ is the number of nodes in $C_l$. Next, $A_s$ is combined with the input graph data, i.e. the adjacency matrix $A$ and the node features matrix $X$, as follows: $(A, X) = (A + \lambda A_s, X)$. After that, the resulting network data are processed by the proposed community-preserving encoders, which encode each node $i$ of the network as an embedding vector $z_i$ of dimension $d \ll n$. The optimization of the neural weights of the encoders is ensured by using both reconstruction and modularity-inspired losses. Finally, the graph reconstructed by the decoder is used for community detection, by applying one of the many clustering algorithms for Euclidean data, such as the k-means algorithm.

In the same vein, Zhou et al. (2023) proposed a Community Detection framework Based on Unsupervised Attributed Network Embedding (CDBNE) capable of controlling and capturing the temporal features of a network. This framework jointly models information about the network topology and the nodes' features with a graph attention autoencoder. Then, it combines the output with a prior mesoscopic community representation of the network using an Encoder/Decoder. Finally, in order to obtain a high-quality representation of nodes, a self-training clustering module optimizes the representation of the learning process.

6.1.3 Label propagation

Let $G = (V, E)$ be a graph, $A$ its adjacency matrix, and $c_i$ the label of node $v_i \in V$, $i \in \{1, \dots, |V|\}$. Generally, in the context of community detection, each node $v_i$ is initially labeled with a different label $c_i = i$. The principle of the Label Propagation Algorithm (LPA) consists in iteratively choosing a random node $v_i \in V$ and changing its label to a new label $c'_i$, which is uniformly chosen from the set of the most frequent labels in the neighborhood of $v_i$.

Hosseini-Pozveh et al. (2022) proposed a label propagation based algorithm for community detection in directed signed social networks. As the available LPA-based methods do not consider the case of signed directed networks, the authors proposed a method that utilizes the direction information to convert the considered signed directed network to an undirected weighted network. In this context, the weight $W_{1,ij}$ of every edge $(v_i, v_j)$ from node $v_i$ to node $v_j$ is computed as follows: if the edge is positive (resp. negative), then:

$$W_{1,ij} = 1 - \frac{K_i^{out+} \times K_j^{in+}}{d_i^{+} \times d_i^{+}} \quad \left(\text{resp. } W_{1,ij} = 1 - \frac{K_i^{out-} \times K_j^{in-}}{d_i^{-} \times d_i^{-}}\right),$$

where $K_i^{out+}$ (resp. $K_i^{in+}$) is the positive out-degree (resp. positive in-degree) of the node $v_i$; $K_i^{out-}$ (resp. $K_i^{in-}$) is the negative out-degree (resp. negative in-degree) of the node $v_i$; and $d_i^{+}$ (resp. $d_i^{-}$) is the positive degree (resp. negative degree) of the node $v_i$.

Furthermore, $W_{1,ij}$ is combined with another weight $W_{2,ij}$ computed from the sign information of the edges as follows: if the directed edge $(v_i, v_j)$ is positive, then

$$W_{2,ij} = \begin{cases} \dfrac{B_{ij} - U_{ij}}{|\{N(i), v_i\} \cup \{N(j), v_j\}|} & \text{if } |\{N(i), v_i\} \cup \{N(j), v_j\}| \ne 0, \\ 0 & \text{otherwise}, \end{cases}$$

else, if the edge is negative, then

$$W_{2,ij} = \begin{cases} \dfrac{U_{ij} - B_{ij}}{|\{N(i), v_i\} \cup \{N(j), v_j\}|} & \text{if } |\{N(i), v_i\} \cup \{N(j), v_j\}| \ne 0, \\ 0 & \text{otherwise}, \end{cases}$$

where $B_{ij}$, $U_{ij}$, $|N^{+}(i)|$, $|N^{+}(j)|$, $|N^{-}(i)|$ and $|N^{-}(j)|$ are the number of balanced triads of edge $(v_i, v_j)$, the number of unbalanced triads of edge $(v_i, v_j)$, the number of positive neighbors of $v_i$, the number of positive neighbors of $v_j$, the number of negative neighbors of $v_i$, and the number of negative neighbors of $v_j$, respectively. Finally, the authors extended the label propagation algorithm so that the combined weights $W_{1,ij}$ and $W_{2,ij}$ are applied for a more accurate community detection in signed directed networks.

In the same vein, Traag and Šubelj (2023) proposed the Fast Label Propagation Algorithm (FLPA), which is based on the same principles used for label propagation algorithms, namely, a label $c_i$ is associated with each node $v_i \in V$ in the network, and a majority update rule is used to update the labels of nodes at each iteration. However, instead of iterating over all nodes whenever a label is changed, the algorithm only considers nodes in whose neighborhood a label has been updated. For this, the algorithm maintains an explicit queue of nodes $Q$. After each iteration, if the label $c_i$ of the node $v_i$ is changed, some of its neighbors $N_i = \{v_j \mid (v_i, v_j) \in E\}$ are added to the queue $Q$. Namely, each neighbor $v_j \in N_i$ whose label is different from the updated label of $v_i$ and does not

[…] Factorization (NMF) based community detection methods follow these general steps (Liu et al. 2023): first, $A$ is decomposed into lower rank non-negative basis matrices, $A = UX^T$, $U \ge 0$, $X \ge 0$. This decomposition is solved using an optimization algorithm. Then, the graph nodes are assigned to the communities according to the obtained model as follows: assuming that the graph has $K$ underlying communities $C_1, \dots, C_K$, the algorithm learns a rank-K approximation of $A$, $\hat{A} = UX^T$, such that $U \ge 0$ is a set of base vectors and $X \ge 0$ is a coefficient matrix in which each row represents a node in the target network. Finally, $X$ can be considered as a soft threshold indicator that identifies the belongingness of nodes to communities, i.e. $\forall j \in \{1, \dots, n\}$ and $k \in \{1, \dots, K\}$, $x_{jk} = \Pr(v_j \in C_k)$; this can be formulated as follows:

$$\forall v_j \in V : \max\{x_{js} \mid s = 1, \dots, K\} = x_{jk} \Rightarrow v_j \in C_k.$$

Liu et al. (2023) proposed a Constraint-induced Symmetric Non-negative Matrix Factorization (C-SNMF) model that allows an undirected network to be modeled with several latent feature matrices. In addition to that, the model integrates a symmetry-regularizer into its objective function in order to conserve the symmetry property of the low-rank approximation. Furthermore, to ensure local invariance of the target network, the model incorporates graph regularization. Thus, the objective function of the model is formulated as follows:

$$\min F(U, X) = \|A - UX^T\|_F^2 + \frac{\mu}{2} \|UX^T - XU^T\|_F^2 + \lambda \, tr(X^T L X),$$

where $A$ is the adjacency matrix, $X \ge 0$ and $U \ge 0$ denote the latent feature matrices, $\mu > 0$ and $\lambda > 0$ are scale parameters, $\|\cdot\|_F$ is the Frobenius norm, $tr(\cdot)$ computes the trace of a matrix, and $L$ is the Laplacian matrix of $A$.

In the same scope, Fang and Lin (2022) proposed an algorithm that uses non-negative matrix factorization to detect overlapping network communities. First, an objective function qualified by the Kullback-Leibler divergence was proposed; then, a new form of non-negative matrix factorization was used to find a solution space.

6.1.5 Deep non negative matrix factorization

Based on the same principle as deep layers in Autoencoders,
yet belong to the queue Q is appended to Q. At each step, Community detection with Deep Non-negative Matrix Fac-
a node is dequeued from Q, the process continues until the torization stacks multiple layers of NMF {U1 , … , Up }, p > 0
queue is empty. to depict node similarities in different levels. In this context,
Deep Autoencoder-like Non-negative Matrix Factorization
uses an Autoencodeur to insure network reconstruction on
6.1.4 Non‑negative matrix factorization
hierarchical mappings (Sun et al. 2017). In this context, the
objective function of community membership Xp > 0 and the
Let G = (V, E) be an interconnection graph network and
hierarchical mapping {Ui }i ≥ 0 are trained by aggregating
p
A its adjacency matrix. Generally, Non-Negative Matrix
both reconstruction loss and $\lambda$ weighted graph regularization; the $\lambda$ weighted graph regularization aims to deal with topological similarity to cluster neighboring nodes. This objective function is described in the following equation:

$$\min_{X_p, U_i} \mathcal{L}(X_p, U_i) = \|A - U_1 \cdots U_p X_p\|_F^2 + \|X_p - U_p^T \cdots U_1^T A\|_F^2 + \lambda\,\mathrm{tr}(X_p L X_p^T),$$

where $\|\cdot\|_F$ is the Frobenius norm and $L$ is the graph Laplacian matrix.

Zhao et al. (2022) proposed a new algorithm based on Autoencoder Non-negative Matrix Factorization (ANMF) that considers both structural topology and node feature information. Furthermore, the algorithm allows balancing the contribution of each kind of information using a flexible parameter. Hence, the ANMF model tries to minimize the loss function described in the following equation:

$$\arg\min_{W,H,Y} \|A - WH\|_F^2 + \|H - W^T A\|_F^2 + f(H, Y)\,\|H^T - CY\|_F^2 + \beta \sum_{r=1}^{k} \|Y(:, r)\|_1^2,$$

where $A$ is the adjacency matrix, $W \in \mathbb{R}_{+}^{n \times k}$ is a set of base vectors, $H \in \mathbb{R}_{+}^{k \times n}$ is the community member matrix ($k \ll n$), $C$ is the node attribute matrix, $Y$ is the community attribute matrix, $\|\cdot\|_F$ is the Frobenius norm, and $\beta$ is a non-negative parameter. Furthermore, in order to balance between structural topology information and node content, the ANMF model proposes a flexible trade-off function $f$ based on community modularity. This trade-off function has two variations $f_1$, $f_2$, which are defined as follows:

$$f_1(H, Y) = \alpha\,|Q(ncl(H^T)) - Q(ncl(CY))|,$$

$$f_2(H, Y) = \left(\frac{\alpha}{|Q(ncl(H^T)) - Q(ncl(CY)) + \epsilon|}\right),$$

where $ncl(H)$ denotes the node community labels obtained from the community member matrix $H$, $ncl(CY)$ denotes the node community labels obtained from the product of the node attribute matrix $C$ and the community attribute matrix $Y$, and $Q(\cdot)$ is the modularity function (Chakraborty et al. 2017). Finally, an optimal solution consists of a community membership matrix $H$ and a community attribute matrix $Y$ that minimize the value of the loss function. To find it, the ANMF model updates the parameters $(W, H, Y)$ iteratively and simultaneously. The updating rules are mainly based on the Block coordinate descent approach (Jin et al. 2015).

Al-sharoa and Rahahleh (2023) proposed a Deep Robust Auto-encoder Non-negative Matrix Factorization (DRANMF) approach for community detection. The approach considers the average of the T-order proximity matrix as an input to the Encoder. A T-order proximity of an adjacency matrix $A \in \mathbb{R}^{n \times n}$ is defined as follows: $S^T = \underbrace{S \cdot S \cdots S}_{T \text{ times}}$, where $S \in \mathbb{R}^{n \times n}$ is the first-order proximity matrix, with $s_{ij} = \frac{A_{ij}}{\sum_{j=1}^{n} A_{ij}}$, and depicts the local structure of nearest neighbor nodes.

Thus, the average of T-order proximity $W \in \mathbb{R}^{n \times n}$ is then defined by the formula $W = \frac{S + S^2 + \cdots + S^T}{T}$, and it depicts the global structure information of the network. In the DRANMF model, the Encoder transforms the average high-order proximity matrix $W$ to the cluster membership space. On the other hand, the Decoder restores it. The aim of the approach is to reduce the distance between the lower-level and higher-level representations of the network for optimal community detection. Thus, the objective function for DRANMF is formulated as follows:

$$\min_{U_i, V_p} \mathcal{L} = \mathcal{L}_D + \mathcal{L}_E + \lambda \mathcal{L}_{reg}.$$

The Decoder term is defined by:

$$\min_{U_i, V_p} \mathcal{L}_D = \|W - U_1 U_2 \cdots U_p V_p\|_{2,1}, \quad \text{s.t. } U_i \geq 0,\ V_p \geq 0,\ \forall i \in \{1, 2, \ldots, p\},$$

the Encoder term is defined by:

$$\min_{U_i, V_p} \mathcal{L}_E = \|V_p - U_p^T \cdots U_2^T U_1^T W\|_{2,1}, \quad \text{s.t. } U_i \geq 0,\ V_p \geq 0,\ \forall i \in \{1, 2, \ldots, p\},$$

where $V_p \in \mathbb{R}^{k \times n}$ is the community membership matrix, $U_i \in \mathbb{R}^{d_{i-1} \times d_i}$ is the multi-layer mapping, such that $1 \leq i \leq p$, $n = d_0 > d_1 > \ldots > d_{p-1} > d_p = k$ and $d_i$ is the dimension of the $i$-th layer, and $\|\cdot\|_{2,1}$ is the $l_{2,1}$ norm, which is chosen because of its robustness against noise. Finally, the regularization term is given by the formula:

$$\min_{U_i, V_p} \mathcal{L}_{reg} = \lambda\,\mathrm{tr}(V_p L V_p^T),$$

where $\lambda$ is a regularization parameter, $L$ is the Laplacian matrix of the adjacency matrix of the network, and $\mathrm{tr}(\cdot)$ returns the trace of a matrix.

The works discussed in this subsection are just a few illustrative examples of community detection approaches based on the Unsupervised learning framework. For further reading on relevant research contributions on this subject, the following references can be consulted: Lei et al. (2019), Gupta et al. (2018), Niu et al. (2023), Gholami et al. (2022) for Clustering based approaches, Li et al. (2021), Xu et al. (2020), Zhang et al. (2020), Cao et al. (2018), Wang et al. (2020), Al-Andoli et al. (2022), Xie et al. (2019) for Autoencoder based approaches, Malhotra and Chug (2021), Bouyer and Roghani (2020), Nath et al. (2021), Ding et al. (2018), Li (2019), He et al. (2019), Deng et al. (2019), Duan et al. (2019), Sattari and Zamanifar (2018a), Sattari and Zamanifar (2018b), Rostami and Oussalah (2022), Laassem et al. (2022), Fang et al. (2022), Luo and Xu (2022), Roghani et al. (2021) for
Label propagation based approaches, Jin et al. (2019), Yan and Chang (2019), Yan and Chang (2020), Roozbahani et al. (2023), Shang et al. (2023), Chen et al. (2022) for Non negative matrix factorization based approaches, and Wu et al. (2018), Huang et al. (2021), Zhang and Zhou (2020) for Deep non negative matrix factorization based approaches.

6.2 Semi-supervised learning

Semi-supervised learning is a special form of classification in which the amount of unlabeled data is much greater than the amount of labeled data. Thus, the aim of semi-supervised classification is to use unlabeled data points in order to model a learner that outperforms the ones obtained by using only the labeled data.

A necessary condition of semi-supervised learning is that the underlying distribution of the marginal data $p(x)$ over the input space must contain information about the posterior distribution $p(y|x)$, such that $y$ denotes the labels associated with the data from the input space. When this condition is satisfied, the algorithm can use unlabeled data to obtain information about $p(x)$, and consequently about $p(y|x)$ (Zhu 2006).

In the literature, there exist three main classes of semi-supervised learning approaches to detect communities in complex information networks: Generative adversarial network based approaches, Graph convolutional network based approaches, and Graph convolutional autoencoder approaches.

6.2.1 Generative adversarial network

Generative Adversarial Networks (GANs) approaches aim to train two neural networks: a Generator $\phi_g$ and a Discriminator $\phi_d$ in the adversarial framework (Goodfellow et al. 2014). The Generative network (the Generator) $\phi_g$ takes input data with a random distribution $\phi_d(x)$. Then, $\phi_g(z)$ learns a generator distribution $p_g$ that represents the targeted probability distribution via input noise variables $p_z(z)$. On the other hand, the Discriminator $\phi_d$ takes an input data which could be a true data $x$ whose density is $p_{data}$ or a generated one $\hat{x}$ whose density is $p_g$. If the Discriminator $\phi_d$ takes sets of true and generated data in the same proportions, its expected absolute error can then be expressed as follows:

$$E(\phi_d, \phi_g) = \mathbb{E}_{x \sim p_{data}(x)}[\log \phi_d(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - \phi_d(\phi_g(z)))].$$

The aim of the Generator is to deceive the Discriminator, which tries to accurately separate true data from generated data. Thus, the objective function of a Generative adversarial network is given by:

$$\min_{\phi_g} \max_{\phi_d} E(\phi_d, \phi_g).$$

Zhang et al. (2020) proposed a framework based on a GAN for learning heuristics for community detection. The algorithm is called Seed Expansion with generative Adversarial Learning (SEAL). To model a Discriminator that determines whether a community $C \subset V$ is real or generated, the authors used a Graph Neural Network (GNN) (Scarselli et al. 2008). The output of the Discriminator is the probability that the community $C$ is real, given by the formula:

$$\mathbb{D}(C, W) = \Pr(C \text{ is real} \mid W) = \left[1 + \exp\left(w_3^T \bar{z}(C)\right)\right]^{-1},$$

where $W = \{W_0, \ldots, W_L, w_1, w_2, w_3\}$ is the set of parameters: $W_j$ is the weight matrix of the $j$th layer of the GNN, $w_1, w_2, w_3$ are vectors of parameters; $\bar{z}(C) = \sum_{u \in C} z(u)$, and $z(u) = [z_0(u), z_1(u), \ldots, z_L(u)]$ is the concatenation of the representations of node $u$ at each layer of the GNN. The Generator aims to generate a local community given a seed node $u_0 \in V$. In this context, the authors used a Greedy Community Generation algorithm inspired by Seed Expansion methods (Andersen and Lang 2006); at each step $t$ of the algorithm, given the current partial solution $C_{t-1} = \{u_0, u_1, \ldots, u_{t-1}\}$, a node $u_t$ is picked from the boundary to expand the community, $C_t = C_{t-1} \cup \{u_t\}$. This expansion is modeled according to a Reinforcement learning algorithm that uses a policy $G(a_t \mid u_0, C_{t-1}, Q)$ parameterized by $Q$, such that $a_t \in A$ is the next action to take.

Wu and Chen (2020) proposed an algorithm based on GAN named Graph Sparsification with Generative Adversarial Network (GSGAN), which aims to sparsify networks in order to detect communities. GSGAN depicts relatively important relationships that do not appear in the original graph, adds artificial edges to represent these relationships, and enhances the effectiveness of the community detection task. The generator of the GAN intends to create random walks which can capture the structure of a network. On the other hand, the discriminator tries to judge the authenticity of the random walk, and considers the relationships between the nodes of the network based on this random walk. Finally, the authors designed a reward function in order to guide the generator to produce random walks that embed useful information on node relations. The different random walks are then aggregated to construct a new social network that is effective for the community detection task.

6.2.2 Graph convolutional network

Let $G = (V, E)$ be an undirected graph, $|V| = n$, $|E| = m$, and $\tilde{G} = (V, \tilde{E})$ the self looped graph obtained by adding to each node $v$ in $G$ a self loop. Let $A$ and $\tilde{A}$ denote the adjacency
matrices of $G$ and $\tilde{G}$ respectively, then $\tilde{A} = A + I_n$. Let $D$ and $\tilde{D}$ be the diagonal degree matrices of $G$ and $\tilde{G}$ respectively, then $\tilde{D} = D + I_n$. Consider $X = [X_1, \ldots, X_n]^T \in \mathbb{R}^{n \times d}$ as the node feature matrix, such that each line $X_i \in \mathbb{R}^d$ is associated with a node $v_i \in V$. The normalized graph Laplacian matrix $L$ is a symmetric positive semi-definite matrix, defined as $L = I_n - D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$ where $D$ is the diagonal degree matrix of $A$. The eigen decomposition of $L$ is defined by $U \Lambda U^T$, such that $\Lambda$ is a diagonal matrix of eigenvalues of $L$, and $U \in \mathbb{R}^{n \times n}$ is a unitary matrix in which each column is an eigenvector of $L$. The graph convolution operation between a signal $x$ and a filter $g_\gamma(\Lambda) = \mathrm{diag}(\gamma)$ is defined as $g_\gamma(L) * x = U g_\gamma(\Lambda) U^T x$, such that $\gamma \in \mathbb{R}^n$ is a vector of spectral filter coefficients (Chen et al. 2020).

In the same vein, Yuan et al. (2023) proposed an approach based on graph convolutional network to learn the community structure of a graph. First, using a two-layer graph convolutional network, the node embeddings matrix $F$ with the maximal Markov stability is computed as follows:

$$F = \sigma(\hat{A}\,\sigma(\hat{A} X W^1)\,W^2),$$

where $W^1$, $W^2$ are the trainable parameters, and $\sigma$ is the ReLU activation function. In order to optimize $F$, the parameters $W^1$ and $W^2$ are iteratively updated depending on the value of the loss function $\mathcal{L}$, which is defined by the Markov stability as follows:

$$\mathcal{L} = -\mathrm{tr}\left(F^T [\Pi P(t) - \pi^T \pi] F\right),$$

where $\pi$ is the stationary distribution of the Markov chain, $\Pi = \mathrm{diag}(\pi)$, $P(t) = M^T$, and $M$ is the transition matrix. This process is repeated until convergence. Then, $F$ is converted to the community affiliation matrix $H$ using a threshold $p$. Indeed, if $F_{uc} > p$ then the node $u$ is considered to be in community $c$ and consequently $H_{uc} = 1$; otherwise $H_{uc} = 0$.

6.2.3 Graph convolutional autoencoder

Some research works introduced Graph Convolutional Networks into Autoencoders. Indeed, Autoencoders can be a good solution to overcome oversmoothing issues and help to highlight community boundaries. In this context, He et al. (2022) proposed a semi-supervised approach based on Graph Convolutional Autoencoder (GCAE) to detect overlapping communities in an attributed graph $G(V, E)$. First, they proposed a version of GCAE that consists of an Encoder and a Decoder. The Encoder aims to learn a hidden representation $H$ of $G$; it is based on a multi-layer graph convolutional network with the following propagation rule:

$$H^{(l)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l-1)} W^{(l)}\right),$$

where $\tilde{A} = A + I$, $A$ is the adjacency matrix, $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$, $W^{(l)}$ is the weight matrix of the $l$th layer, and $\sigma$ is the activation function, which is defined as $\sigma(x) = \mathrm{ReLU}(x) = \max(0, x)$ for all layers except the last layer, where the function $\sigma(x_i) = \mathrm{Softmax}(x_i) = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$ is used. On the other hand, the Decoder is used to reconstruct $\tilde{A}$, an approximation to the original adjacency matrix $A$, from $H$, where every element $\tilde{A}_{ij}$ in $\tilde{A}$ represents the probability that two nodes $v_i$ and $v_j$ are connected:

$$\Pr(\tilde{A}_{ij} = 1) = \mathrm{sigmoid}(H_i H_j^T) = \frac{1}{1 + \exp(-H_i H_j^T)},$$

where $H_i$ is the $i$th column of the matrix $H$. In order to measure the reconstruction loss between $A$ and $\tilde{A}$, binary cross entropy is used:

$$\mathcal{L} = -\frac{1}{n^2} \sum_{ij} \left[ A_{ij} \log \Pr(\tilde{A}_{ij} = 1) + (1 - A_{ij}) \log\left(1 - \Pr(\tilde{A}_{ij} = 1)\right) \right].$$

Then, to allow $H$ to indicate the community belongingness, Newman modularity maximization is used (Newman 2006). Finally, a semi-supervised learning module is proposed in order to incorporate $Y$, the known community belongingness of some nodes. This can enhance the accuracy of community detection in the network. In this context, cross-entropy loss is used as the semi-supervision loss:

$$\mathcal{L}_s = -\sum_{i \in Y} \sum_{j=1}^{k} Y_{ij} \ln(H_{ij}),$$

where $Y$ is the set of labeled nodes.

De Santo et al. (2021) proposed a semi-supervised approach for community detection based on Convolutional Neural Networks (CNN). Their main idea was to simultaneously depict different features of an interconnection network, such as topological characteristics and contextual information. First, they feed the CNN input layer with a two-dimensional user-to-user matrix whose components could represent many types of information such as similarity, friendships, and so forth. Segments of this adjacency matrix are extracted by considering individual rows. Indeed, each row is transformed into a two-dimensional matrix. The result is a set of $n$ adjacency matrices, where $n$ is the number of users in the network. Each matrix represents adjacency relationships (similarity, friendships, trust, etc.) between a given user and the rest of the network. The idea consists in representing network connections over a particular sparse matrix using the SparseConv2D algorithm. Then, the different convolution operations are performed just around the non-zero components.
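
As an illustration of the GCAE Encoder and Decoder described in this subsection, the following minimal Python sketch applies the propagation rule with the symmetrically normalized self-looped adjacency matrix, a ReLU hidden layer and a Softmax output layer, then decodes link probabilities with a sigmoid of inner products and evaluates the binary cross-entropy reconstruction loss. The four-node toy graph, the identity feature matrix, the hand-picked weight matrices and all function names are assumptions made for this example; they are not taken from He et al. (2022). The sketch treats each row of H as the embedding of one node, and it includes no training, modularity term or semi-supervision loss.

```python
import math


def normalized_adjacency(adj):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # at least 1 thanks to self loops
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def softmax_rows(m):
    out = []
    for row in m:
        top = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - top) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def gcn_encoder(adj, features, weights):
    """Apply H(l) = sigma(A_hat H(l-1) W(l)); Softmax on the last layer."""
    a_hat = normalized_adjacency(adj)
    h = features
    for idx, w in enumerate(weights):
        h = matmul(matmul(a_hat, h), w)
        h = softmax_rows(h) if idx == len(weights) - 1 else relu(h)
    return h


def decode(h):
    """Link probabilities sigmoid(H_i . H_j) for every node pair."""
    n = len(h)
    return [[1.0 / (1.0 + math.exp(-sum(x * y for x, y in zip(h[i], h[j]))))
             for j in range(n)] for i in range(n)]


def reconstruction_loss(adj, probs, eps=1e-12):
    """Binary cross entropy between A and the decoded probabilities."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = min(max(probs[i][j], eps), 1.0 - eps)
            total += adj[i][j] * math.log(p) + (1 - adj[i][j]) * math.log(1.0 - p)
    return -total / (n * n)


# Toy graph (assumption): two disconnected pairs, 0-1 and 2-3.
adj = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
features = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
weights = [
    [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],  # layer 1: 4 -> 2
    [[2.0, 0.0], [0.0, 2.0]],                          # layer 2: 2 -> 2
]

H = gcn_encoder(adj, features, weights)
labels = [row.index(max(row)) for row in H]  # hard assignment by argmax
probs = decode(H)
print(labels, round(reconstruction_loss(adj, probs), 4))
```

With these hand-picked weights, nodes 0 and 1 receive one label and nodes 2 and 3 the other, and the decoded probability of a link inside a pair is higher than across pairs. In a real setting the weights would be learned by minimizing the reconstruction loss together with the other terms described above.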
6.3 Supervised learning

Let $D_l = \{(x_1, y_1), \ldots, (x_l, y_l)\}$ be a set of $l$ labeled data points. In the context of community detection, each data point $(x_i, y_i)$ consists of a labeled node $x_i \in \mathbb{R}^d$ in the network, where $d$ is the number of features, and an associated label $y_i$ which represents the community to which the node $x_i$ belongs. Based on the set $D_l$, known as the training dataset, supervised learning approaches aim to infer a classifier that can accurately predict the label (the community) $\hat{y}$ of each new unlabeled input node $\hat{x}$ described by its features. Another set of labeled nodes (data points), called the test dataset, is used to evaluate the classifier.

The supervised learning framework in the context of community detection includes five main approaches: Nearest neighbors, Support vector machine, Recurrent neural network, Deep reinforcement learning, and Convolutional neural network.

6.3.1 Nearest neighbors

K Nearest Neighbors (K-NN) is a simple supervised learning algorithm that can be efficient in classifying nodes into communities. In complex information networks, every node in the network is represented by a vector of its feature values. Given a training set of network nodes $x_i \in D_l$, $D_l \subset \mathbb{R}^d$, $|D_l| = l$, every node $x_i \in D_l$ is associated with its community label $y_i \in \{1, \ldots, M\}$. Then, to classify any unlabeled node in the network $x_j \notin D_l$, its distances from the data points of the training set $D_l$ are computed. Then, the $K$ nearest neighbors are selected and $x_j$ is assigned to the community containing the maximum number of its nearest neighbors.

In this sense, Jia et al. (2017) proposed a community detection algorithm called KNN-enhance that adds the K-NN graph of node attributes to mitigate the issues of the sparsity and the noise effect (false edges) of the network. Indeed, by adding an edge between nearest neighbors having common features, the resulting attribute-enhanced network shows a more strengthened community structure. In addition to that, a community detection approach such as K-rank-D (Li et al. 2015) could be applied to depict communities in the resulting network.

Furthermore, Shang et al. (2017) proposed a labeling algorithm that combines the K-NN and a label propagation algorithm. The algorithm iteratively labels a node in a network using the labels of its adjacent nodes and their index of closeness, which allows sub-communities to be generated automatically. Then, it uses the mutual membership of two adjacent sub-communities in order to eventually merge them. Finally, a refinement strategy is proposed to update the labels of some incorrectly classified nodes at community boundaries.

6.3.2 Support vector machine

Given a training set $D_l = \{(x_i, y_i)\}_{i=1}^{l}$, $x_i \in \mathbb{R}^d$, $y_i \in \{-1, 1\}$, the Support Vector Machine (SVM) classifier takes the form $y(x) = \mathrm{sign}[w^T \Phi(x) + b]$, such that $\Phi : \mathbb{R}^d \to \mathbb{R}^m$, $(m \geq d)$, is a non-linear function that maps the input space to a higher dimensional feature space, and $b$ is the bias term. To evaluate the SVM classifier, the following optimization problem is defined:

$$\min_{w, b, \xi} \mathcal{L}(w, \xi) = \frac{1}{2} w^T w + c \sum_{k=1}^{l} \xi_k$$

subject to:

$$\begin{cases} y_k [w^T \Phi(x_k) + b] \geq 1 - \xi_k, & \forall k \in \{1, \ldots, l\}, \\ \xi_k \geq 0, & \forall k \in \{1, \ldots, l\}, \end{cases}$$

where $\xi$ aims to allow misclassifications due to overlapping distributions, and $c \geq 0$ is a tuning parameter. The minimization of $\|w\|^2$ corresponds to the maximization of the margin (separating distance) between the two classes $y = +1$ and $y = -1$.

In the context of community detection, Sui et al. (2016) introduced an SVM based algorithm to detect communities in networks. The feature vector $x_i \in \mathbb{R}^d$ of a node $v_i$ is obtained from the transformation of topological information of the network. Thus, the $j$th feature of the feature vector $x_i$ relative to node $v_i$ is computed by the formula $x_{ij} = \frac{1}{d(v_i, V_{max}^{k}) + 1}$, where $d(v, v')$ is the length of the shortest path in the network from node $v$ to node $v'$ and $V_{max}^{k}$ is the maximum node of $k$-degree. This proposed algorithm predicts the community structure of a network based on a small amount of test data. This is a useful trait, since a genuine complex network is sparse and does not have many connections between its vertices.

In the same vein, Nema and Pandey (2015) proposed an approach for detecting community kernels in large social networks based on both SVM and community kernels classification.

6.3.3 Recurrent neural network

In Recurrent Neural Network (RNN) approaches (Sutskever et al. 2014), given a sequence of input data instances $(x_1, \ldots, x_T)$, the sequence of outputs $(y_1, \ldots, y_T)$ is computed as follows:

$$h_t = \sigma(W^x x_t + W^h h_{t-1}),$$

$$y_t = W^y h_t,$$

where $W^x$, $W^h$ and $W^y$ are weight matrices of the RNN, and $h_t$ is the hidden state at step $t$.

The RNN can map input sequences to output sequences when the alignment between the inputs and the outputs is known. However, when the input and the output sequences have different lengths, it would be more convenient to use
Long Short-Term Memory (LSTM) Neural Networks (Sutskever et al. 2014). LSTM aims to estimate the conditional probability $\Pr(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T)$ such that $(x_1, \ldots, x_T)$ is an input sequence, and $(y_1, \ldots, y_{T'})$ is its corresponding output sequence:

$$\Pr(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T) = \prod_{t=1}^{T'} \Pr(y_t \mid v, y_1, \ldots, y_{t-1}),$$

such that $v$ is the fixed dimensional representation of the input sequence $(x_1, \ldots, x_T)$, which is given by the last hidden state of the LSTM.

In this context, Ali et al. (2023) proposed a new framework based on graphs and Long Short Term Memory Gated Recurrent Units (LSTM-GRU) to detect hate content in social networks. In addition to that, they described a new approach to depict hate content communities. First, they developed annotated data to train the model and detect communities. Then, the authors proposed various models based on customized LSTM and RNN for classifying tweets. Within this scope, they combined LSTM with GRU algorithms and arranged layers in order to allow the approach to recognize text patterns. Indeed, a user who shares hate content tends to retweet offensive content. This information was extracted and the graph established. Finally, hate-spreading communities were detected from the trained LSTM-GRU graph model using the Girvan-Newman algorithm (Girvan and Newman 2002).

6.3.4 Deep reinforcement learning

In Reinforcement Learning (RL), we distinguish two main elements: the agent and the environment. The agent operates in the environment and is responsible for taking actions which can change the environment, and for getting rewards for each chosen action. This reward can have a positive or a negative value. More specifically, at each instant $t$, depending on the environment dynamics, the agent receives a state $s^{(t)} \in S$ from a state space $S$ and decides to take an action $a^{(t)} \in A$ from an action space $A$, following a predefined policy $\pi(a^{(t)} \mid s^{(t)})$ (i.e. a mapping from state $s^{(t)}$ to action $a^{(t)}$). Thus, the agent receives a scalar reward $r_{s,a}^{(t)}$ defined by a reward function:

$$R : S \times A \to \mathbb{R}, \quad (s^{(t)}, a^{(t)}) \mapsto R(s, a) = r_{s,a}^{(t)},$$

and makes a transition to the next state defined by the state transition probability $\Pr(s^{(t+1)} \mid s^{(t)}, a^{(t)})$. The steps of reinforcement learning are iterated until the agent reaches a terminal state.

In this context, Costa and Ralha (2023) proposed an architecture called Actor-Critic Community Detection (AC2CD) for community detection in large dynamic social networks. AC2CD is based on deep reinforcement learning; it uses the message-passing feature of Graph Attention Network (Velickovic et al. 2017) to propagate the labels of the different communities. The approach chosen for the Reinforcement Learning is the Actor-Critic (A2C) method, which aims to optimize simultaneously the policy and the value function with proximal policy optimization described by Schulman et al. (2017).

6.3.5 Convolutional neural network

Convolutional Neural Network (CNN) (Yu et al. 2014) is a type of Feed-Forward Deep Neural Network which is well suited to data with a grid-like topology. A CNN has two main types of hidden layers: Convolution layers and Pooling layers. A convolution layer aims to depict patterns from local regions of the input data $x_k$, by computing the feature map $y_k$ using a linear convolutional filter $w$ followed by a non-linear activation function $f$, such as ReLU, tanh, sigmoid, etc., as follows:

$$y_k = f(x_k * w).$$

On the other hand, a pooling layer aims to transform the feature map into a more convenient representation that preserves only relevant information and discards irrelevant information. This ensures robustness to clutter, invariance to distortions, and a more compact representation. Within this scope, there are mainly two pooling methods: Max Pooling and Average Pooling. Max Pooling selects the largest value in its input Pooling Region $R_{ij}$ around a position $(i, j)$. The output $y_{kij}$ of the Max Pooling related to the $k$th feature map and the region $R_{ij}$ is defined as follows:

$$y_{kij} = \max_{(p,q) \in R_{ij}} x_{kpq}.$$

Average Pooling, in contrast, computes the $k$th feature by using the arithmetic mean of the elements in its input pooling region $R_{ij}$, as described in the following equation:

$$y_{kij} = \frac{1}{|R_{ij}|} \sum_{(p,q) \in R_{ij}} x_{kpq},$$

such that $x_{kpq}$ is the $k$th component of the point $(p, q) \in R_{ij}$.

In this context, Xin et al. (2017) proposed one of the first models of Convolutional Neural Network (CNN) for community detection. Every node $i$ in the network is represented by a feature vector $v_i \in \mathbb{R}^n$, where $n$ is the number of nodes. Every component $v_{ij}$, $j \in \{1, \ldots, n\}$, of the vector $v_i$ takes a value between 0 and 1 and is given by the formula $v_{ij} = e^{\sigma(1-s)}$ if the node $j$ is reachable from $i$ within at least $s$ steps ($1 < s < s_0$); note that $\sigma \in [0, 1]$ is an attenuation parameter. The obtained feature vector is then transformed into a 2-dimensional $w \times h$ (s.t. $w \times h = n$) matrix without
altering the adjacency relations of nodes. The first convolutional layer of the CNN has $c_1$ convolutional kernels. Each kernel is represented as a $w' \times h'$ matrix. Once the convolutional operation is applied over the social network, a feature map can be obtained as follows:

$$v_{xy}^{n} = \sigma\left(b_W + \sum_{i=0}^{w'-1} \sum_{j=0}^{h'-1} W_{ij} P_{(x+i)(y+j)}^{n}\right),$$

where $P_{(x+i)(y+j)}^{n}$ is the entry value of position $(i, j)$, $W_{ij}$ is the weight at position $(i, j)$ of the convolutional kernel $W$ and $b_W$ is the bias. Thus, the first convolutional layer maps every node to $c_1$ feature maps. The last layer is a fully connected layer consisting of $K$ output neurons, where the value of $K$ is the number of communities. The value of the $k$th output neuron of node $n$ (denoted as $o_n^k$) is defined as follows:

$$o_n^k = \sigma(b_k^f + W_k^f q_n^{c_1.c_2}),$$

where $q_n^{c_1.c_2}$ is the output of the second convolutional layer, $W_k^f$ represents the weights in the fully connected layer, and $b_k^f$ is its bias. To learn the model parameters $P = (W, W^f, b, b^f)$, the CNN uses back-propagation. Indeed, let $\{(S_n, I_n) : 1 \leq n \leq T\}$ be a set of $T$ training examples, such that $S_n$ is the adjacency relation of a node $n$, and $I_n = (I_n^1, I_n^2, \ldots, I_n^k)$ is the labels vector of a node $n$, where $I_n^i \in \{0, 1\}$, $\forall i \in \{1, \ldots, k\}$, represents whether or not the node $n$ belongs to the $i$th community. The cost function is given by the formula:

$$J(P) = \frac{1}{2} \sum_{n=1}^{T} \|o_n - I_n\|^2 = \frac{1}{2} \sum_{n=1}^{T} \sum_{k=1}^{K} (o_n^k - I_n^k)^2,$$

and $J(P)$ is optimized by back-propagation.

On the other hand, Cai et al. (2022) proposed MFF-Net, a Multi-Feature Fusion Network for community detection. This approach intends to overcome the feature representation issue which is due to the limitation of manual definition of the relationship between nodes and local feature representations of edges. For this, the authors first propose to use only local features of the edges. Then, they consider both local and non-local relationships of the edges through nodes' neighborhood and random walk nodes' sequence. After that, in order to classify edges, they introduced a quantitative relationship between nodes by converting the local and non-local feature representations into grayscale images. Finally, and more importantly, the authors proposed a CNN-based local and non-local feature fusion scheme to enhance the performance of edge classification.

6.4 Concluding remarks and discussion

Table 5 describes a comparative study of some relevant research works relative to community detection approaches based on the Machine learning model.

• Machine learning approaches are very powerful for detecting communities based on network node features. However, the lack of access to real complex network data such as logs may be more restrictive compared with the other approaches that rely only on the graph information. This can be mitigated by using in-the-cloud preprocessing (convolution, encoding, etc.) to prepare the data for classification while making the data more confidential for a human being.
• Even though unsupervised machine learning seems to be the predominant framework for community detection based on the Machine learning model, the lack of information could be a major constraint. Indeed, as there are numerous forms of interaction, node features, etc., logs and network datasets will always lack critical information for automatically training an accurate predictive model.
• Though supervised learning based approaches may seem to be more relevant in classifying nodes into known classes (communities), labeling data is an expensive task, as it may require in many cases experienced human annotators.
• Another point to note is that the majority of approaches based on the Machine learning model that are used in community detection are time-consuming. In fact, their time complexities are generally quadratic (Niu et al. 2023), polynomial of higher degree (Lu et al. 2020) or even exponential (Gupta et al. 2018). Thus, they could present convergence time issues when dealing with large networks, which is a common challenge in the big data era. Nevertheless, the majority of approaches are either distributed or parallelizable. In this context, massive computing, parallelization and distributed computing on the cloud seem to be worthwhile solutions.

7 Multi-objective optimization model

Let $\Phi = \{C_1, \ldots, C_m\}$ be the set of all feasible clustering solutions in a complex network. Multi-objective optimization aims to find a solution that optimizes some predefined criteria known as fitness functions.

In the case of a single criterion, community detection can be formulated as a single objective optimization problem $(\Phi, \rho)$, where $\rho : \Phi \to \mathbb{R}$ is the fitness function which
Table 5  Comparative table of characteristics of complex network communities based on Machine learning model

Classification | | | Works | Criteria | | | | |
Model | Framework | Approach | Works | Community type | Links | Distribution | Complexity | Scalability | Application
Machine learning | Unsupervised | Clustering | Gupta et al. (2018) | Partitioning | Undirected | Distributed | O(n^(dim×k+1)), dim: dimension | Considered | Service composition
 | | | Lu et al. (2020) | Overlapping | Undirected | Parallelizable | O(r × n^3 + p × n^2 + n^2 k t), p: membership degree, t: number of the fuzzy clustering algorithm iterations, r: number of steps in the random walk | Considered | Community detection in complex networks
 | | | Niu et al. (2023) | Overlapping | Undirected | Parallelizable | O(N^2) | Considered | Community detection in social networks
 | | | Gholami et al. (2022) | Overlapping | Undirected | Centralized | – | Not considered | Social circle detection in social networks
 | | Autoencoder | Al-Andoli et al. (2022) | Overlapping | Undirected | Parallel | – | Considered | Community detection in complex network
 | | | Salha-Galvan et al. (2022) | Partitioning | Undirected | Parallelizable | Linear complexity | Considered | Community detection in complex networks
 | | | Zhou et al. (2023) | Partitioning | Undirected | Parallelizable | O(T_1(k × log m + n × F × DIM + m × DIM) + n dim_i k + T_2(n × k + n log n)), dim_i: dimension of the ith layer, F: number of attributes, DIM = max(dim_i) | Considered | Community detection in complex network
 | | Label propagation | Rostami and Oussalah (2022) | Partitioning | Undirected | Parallelizable | O(n) | Considered | Community detection in social network
 | | | Laassem et al. (2022) | Partitioning | Undirected | Parallelizable | O(nm) | Considered | Community detection in complex network
 | | | Fang et al. (2022) | Partitioning | Undirected | Parallelizable | O(n + m) | Considered | Community detection in complex networks
 | | | Luo and Xu (2022) | Partitioning | Undirected | Parallelizable | O(nd) | Considered | Community detection in complex network
 | | | Hosseini-Pozveh et al. (2022) | Partitioning | Directed (signed) | Parallelizable | O(mt + m(d̄) log(d̄)), t: number of iterations | Considered | Community detection in signed social network
 | | | Traag and Šubelj (2023) | Partitioning | Undirected | Parallelizable | Near linear | Considered | Community detection in complex networks
Table 5  (continued)

Classification | | | Works | Criteria | | | | |
Model | Framework | Approach | Works | Community type | Links | Distribution | Complexity | Scalability | Application
 | | Non-negative matrix factorization | Liu et al. (2023) | Partitioning | Undirected | Parallelizable | Quadratic with the number of nodes | Considered | Community detection in complex network
 | | | Roozbahani et al. (2023) | Overlapping | Directed | Parallelizable | O(n × l × m × k^2), l: number of layers in multilayer network | Considered | Community detection in multi-relational networks
 | | | Huang et al. (2021) | Partitioning | Undirected | Parallelizable | – | Considered | Community detection in complex networks
 | | | Fang and Lin (2022) | Overlapping | Undirected | Parallelizable | O(n^2 × log(n)) | Considered | Community detection in complex networks
 | | | Shang et al. (2023) | Partitioning | Undirected | Parallelizable | O(T(mk + bk)), b: number of 1 in attribute matrix | Considered | Community detection in complex networks
 | | | Chen et al. (2022) | Partitioning | Undirected | Centralized | O(n^2 k) | Not considered | Community detection in complex networks
 | | Deep non-negative matrix factorization | Zhang and Zhou (2020) | Partitioning | Undirected | Parallelizable | O(m(t_a + t_b)(n^2 S + n S^2)), S: max size of all layers, t_a: number of iterations for convergence, t_b: number of iterations for updating stage factorization | Scalable | Social circles detection
 | | | Zhao et al. (2022) | Partitioning | Undirected | Parallelizable | Complexity of one iteration: O(n^2 k + nkt), n: number of nodes, k: number of communities, t: number of node attributes | Not considered | Community detection in attributed graphs
 | | | Al-sharoa and Rahahleh (2023) | Partitioning | Undirected | Parallelizable | O(l(t_p + t_f)(n^2 d + S l^2)), t_p: number of iterations in training step, t_f: number of iterations in the fine-tuning step, S: maximum layer size, l: number of layers | Considered | Community detection in complex network
 | Semi-supervised | Generative Adversarial Network | Zhang et al. (2020) | Overlapping | Undirected | Parallelizable | – | Considered | Community detection in complex networks
 | | | Wu and Chen (2020) | Partitioning | Undirected | Parallelizable | – | Considered | Community detection in complex networks
 | | Graph convolutional network | Yuan et al. (2023) | Overlapping | Undirected | Parallelizable | – | Considered | Community detection in complex network
 | | Graph convolutional autoencoder | He et al. (2022) | Overlapping | Undirected | Parallelizable | O(m f d_1 d_2 … d_k + n^2 k), f: size of attributes sets, d_i: internal degree of community i | Considered | Community detection in attributed graph
Table 5  (continued)

Classification | | | Works | Criteria | | | | |
Model | Framework | Approach | Works | Community type | Links | Distribution | Complexity | Scalability | Application
 | Supervised | Neighbor based | Jia et al. (2017) | Partitioning | Undirected | Centralized | O(n × dim × α), dim: dimension, α: hyperparameter | Considered | Community detection in complex networks
 | | | Shang et al. (2017) | Partitioning | Undirected | Parallelizable | – | Not considered | Community detection in complex networks
 | | SVM based | Nema and Pandey (2015) | Partitioning | Undirected | Centralized | O(n^3) | Not considered | Social trust circle detection
 | | | Sui et al. (2016) | Partitioning | Undirected | Centralized | – | Not considered | Community detection in complex networks
 | | Recurrent Neural network | Ali et al. (2023) | Partitioning | Undirected | Parallelizable | – | Considered | Community detection in social network
 | | Deep Reinforcement learning | Costa and Ralha (2023) | Partitioning | Undirected | Parallelizable | – | Considered | Community detection in dynamic social network
 | | Convolutional neural network | Cai et al. (2022) | Partitioning | Undirected | Parallelizable | O(n^3) | Considered | Community detection in complex networks

n: number of nodes in the network. D: average degree of nodes. K: number of clusters. m: number of edges, dim: dimension

intends to find an optimal clustering solution C* ∈ Φ as follows:

\rho(C^*) = \min_{C \in \Phi} \rho(C).

In the case of multiple criteria, community detection can be formulated as a multi-objective optimization problem (Φ, P) of finding a possible dominant solution C*, where P = (ρ_1, …, ρ_k)^T is a vector of k competing criteria functions that should be simultaneously optimized, ρ_i: Φ → ℝ, i ∈ {1, …, k}. More formally, the aim is to find a dominant clustering solution C* ∈ Φ, where:

\rho_i(C^*) = \min_{C \in \Phi} \rho_i(C), \quad \forall i \in \{1, \dots, k\}.

Since generally a dominant solution does not exist, Pareto optimality is used to find clustering solutions. Given C_1, C_2 ∈ Φ, we say that C_1 dominates C_2, denoted as C_1 ≺ C_2, if and only if:

\forall i \in \{1, \dots, k\}: \rho_i(C_1) \le \rho_i(C_2) \;\wedge\; \exists j \in \{1, \dots, k\}: \rho_j(C_1) < \rho_j(C_2).

Multi-objective optimization generates a set of non-dominated solutions, called the Pareto front Π:

\Pi = \{ C \in \Phi : \nexists\, C' \in \Phi \text{ with } C' \prec C \}.

Then, the Pareto front Π is presented to an expert of the domain on which community detection is applied to select the most suitable solution.
In the context of multi-objective optimization, there are two main frameworks currently adopted in the research literature: the Hierarchical framework and the Metaheuristic framework. In the following subsections we describe these frameworks and discuss some relevant related research works.

7.1 Hierarchical

The objective of Hierarchical approaches is to detect a hierarchical community structure of a network. In the context of community detection, we distinguish two main approaches in the literature: Divisive and Agglomerative.

7.1.1 Divisive

Generally, these approaches follow three main steps: first, a weight W_ij is computed for every pair of nodes v_i, v_j in the network. These weights represent connection tightness between nodes or distance (like similarity). Then, the whole network is considered as a global community. After that, the edges are removed from the network, one by one, according to their weights, starting with the edges with the lowest weights or the highest. With this removal of the edges, a hierarchy of connected components of decreasing size is incrementally formed. These components represent the hierarchical community structure of the network.
Within this scope, Girvan and Newman (2002) proposed an algorithm for detecting a hierarchical community structure of a network. First, betweenness centrality is computed for every edge in the network; the betweenness centrality of an edge (v_i, v_j) is defined as the number of shortest paths in the network that contain the edge (v_i, v_j). Then, iteratively, the edges are removed one by one from the network, starting with the ones with the highest betweenness, and after each removal the betweenness is recomputed for every affected edge. This process is repeated until no edge remains in the network.
Li et al. (2019) proposed a hierarchical community detection approach based on edge-weighted similarity. First, a directed social network is constructed by considering the directed fellowship relationship; vertices in this initial network are represented by interest vectors. Then, this network is converted into a new one that includes undirected weighted edges. After that, weights are computed from the direction, the interest vectors, and the similarity between edges. Next, communities are detected by a hierarchical clustering algorithm based on the edge-weighted similarity. Finally, the number of communities is determined by the partition density.

7.1.2 Agglomerative

The agglomerative approaches depict the ascendant hierarchical community structure of a network. As in divisive approaches, a weight function W maps every edge (v_i, v_j) to a real number W_ij which represents connection closeness between the pair of nodes (v_i, v_j). Generally, these approaches start by considering every node as a separate community. Then, the communities are iteratively merged according to some specified criteria. As the algorithm is executed, a set of increasingly expanding connected components is gradually constructed. These components represent the hierarchical communities.
In this vein, Blondel et al. (2008) proposed the Louvain heuristic, a modularity based agglomerative approach for community detection. The Louvain algorithm has two phases, which are repeated iteratively. In the first phase, each node of the network is assigned to a different community. Then, iteratively, some nodes are removed from their communities and inserted into different communities. These actions are determined by evaluating the gain of modularity (Newman 2006) obtained by making such permutations. This step is repeated until no further improvement is possible. The second phase consists in constructing a new network from the community structure obtained from the first phase. Hence, nodes belonging to the same community obtained in the first phase are merged into a single node. Edges between nodes are the sum of the edges

connecting nodes from the two corresponding communities. Self-loops are generated by summing all edges inside a given community. The two phases are repeated until no more changes are possible and a maximum of modularity is attained.
The main challenge of the sequential Louvain approach is how to deal with large-scale graphs. Thus, many researchers have proposed several parallel Louvain methods (Que et al. 2015). However, these methods suffer from two main drawbacks: latency in the information synchronization and communities swap (Blondel et al. 2008). To overcome these drawbacks, Blondel et al. (2008) proposed a new graph partition algorithm for the parallel Louvain method. The idea consists in dividing the graph into isolate sets, which are defined as follows: let G(V, E) be a graph network; a subset of nodes s ⊂ V is called an isolate set if ∀v_i, v_j ∈ s with v_i ≠ v_j, N⁺(v_i) ∩ N⁺(v_j) = ∅, where N⁺(v) = {v} ∪ N(v) is the dependency set of v ∈ V and N(v) is the set of neighbor nodes of v. The vertices in the isolate sets are thus decoupled from one another. Finally, the algorithm computes and synchronizes information without latency and communities swap.
For further reading on the subject of community detection based on Agglomerative multi-objective optimization approaches, many other substantial contributions in this context can be found in the following research works (Lalwani et al. 2015; Qie et al. 2022; Zhao et al. 2023; Liu and Ma 2019; Haq et al. 2019).

7.2 Metaheuristic

Community detection is an NP-hard problem. Thus, research works generally opt for approximate solutions. These solutions are found using approaches based on general algorithmic structures which can be easily adapted to solve a variety of optimization problems. These are known as Metaheuristics. Community detection based on the Metaheuristic framework uses three relevant classes of approaches: Evolutionary inspired, Nature inspired, and Swarm inspired.

7.2.1 Evolutionary inspired

An evolutionary algorithm is defined as an 8-tuple EA = (I, ρ, Ω, Ψ, S, ι, μ, λ), such that I is a space of potential individuals representing clustering solutions, ρ: I → ℝ is the fitness function to optimize, Ω = {ω_{θ_1}, …, ω_{θ_k} | ω_{θ_k}: I^λ → I^λ} ∪ {ω_{θ_0}: I^μ → I^{λ+μ}} is a set of probabilistic genetic operators, namely mutations and crossovers, such that each operator ω_{θ_i} is controlled by a set of parameters θ_i ⊂ ℝ, S: I^λ ∪ I^{μ+λ} → I^μ is a selection operator where μ, λ ∈ ℕ and λ ≥ μ, μ is the number of parent individuals, λ is the number of offspring individuals, ι: I^μ → {true, false} is the termination criterion, and Ψ = S ∘ ω_{θ_{i_1}} ∘ … ∘ ω_{θ_{i_o}} describes the steps of transforming the population through the population sequence:

Pop(0), Pop(1), Pop(2), \dots, Pop(n) \in I^{\mu} \quad \text{s.t.} \quad Pop(t + 1) = \Psi(Pop(t)), \; \forall t \ge 0.

Malhotra (2021) proposed a Hybrid Genetic Algorithm with Link Strength-based local search strategy (HGALS) for community detection. This method tries to optimize the modularity function using a genetic algorithm. First, the initial population is represented by string encoding, i.e. each solution is encoded as C = [c_1, c_2, …, c_n], where c_i ∈ {0, 1, …, k − 1} represents the community number of node v_i, n denotes the number of nodes in the graph and k the number of communities. The initialization of the population is based on the safe and balanced initialization approach described by Guerrero et al. (2017). Furthermore, the HGALS algorithm uses the border exchange crossover operator (Guerrero et al. 2017), such that the nodes belonging to two different communities exchange their community labels in the solution representation. This exchange depends on the number of external connections of each node to the other communities. Moreover, the algorithm uses a neighbor-based mutation operator (De Nooy et al. 2018) to explore a better solution set. Finally, HGALS uses a link strength-based local search operator, which merges different communities to obtain a reduced set of more optimized ones. This local search allows a correct assignment of nodes to the communities in which they have a tight bonding with other nodes.
Bello-Orgaz et al. (2018) proposed the Multi-Objective Genetic Algorithm for Overlapping Community Detection (MOGA-OCD). The algorithm uses a phenotype-type encoding that is based on edge information. Therefore, in a chromosome (a solution) a position (allele) i represents the edge e_i and its value is randomly selected from its adjacent edges. MOGA-OCD tries to optimize two objective functions: the first objective function is based on internal measures (Density, Triangle Participation Ratio, Clique Number and Clustering Coefficients) and ensures the maximization of the internal edge density of communities. On the other hand, the second objective function is based on external measures (Expansion, Separability and Cut Ratio) and aims to minimize inter-community connections. The MOGA-OCD algorithm iteratively selects two individuals (chromosomes) from the n-best individuals, then it performs one-point crossover (Umbarkar and Sheth 2015) operations over them. Finally, it randomly selects some individuals from the newly generated ones and, using a predefined mutation probability, mutations are performed over them.
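
To make the generic evolutionary loop of this subsection concrete, the following Python sketch evolves label-string solutions C = [c_1, …, c_n] with modularity as the fitness function, one-point crossover, a neighbor-label mutation, and (μ + λ) truncation selection. It is a minimal illustration under stated assumptions, not the HGALS or MOGA-OCD algorithm: the function names, the parameter defaults, and the uniform random initialization are hypothetical choices made for this example, and nodes are assumed to be numbered 0, …, n − 1 in an undirected graph with at least one edge.

```python
import random


def modularity(adj, labels):
    """Newman modularity Q of a label assignment on an undirected graph.

    adj: dict mapping node -> set of neighbor nodes (no self-loops).
    labels: list where labels[v] is the community label of node v.
    """
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    intra = {}   # community -> number of internal edges
    degree = {}  # community -> sum of the degrees of its nodes
    for v, nbrs in adj.items():
        c = labels[v]
        degree[c] = degree.get(c, 0) + len(nbrs)
        for u in nbrs:
            if u > v and labels[u] == c:  # count each internal edge once
                intra[c] = intra.get(c, 0) + 1
    return sum(intra.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def one_point_crossover(a, b, rng):
    """Child takes a prefix of parent a and the matching suffix of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def neighbor_mutation(adj, labels, rate, rng):
    """With probability `rate`, a node copies the label of a random neighbor."""
    child = list(labels)
    for v in adj:
        if adj[v] and rng.random() < rate:
            child[v] = child[rng.choice(sorted(adj[v]))]
    return child


def evolve(adj, mu=20, lam=40, generations=60, rate=0.1, seed=0):
    """(mu + lambda) evolutionary search maximizing modularity (illustrative)."""
    rng = random.Random(seed)
    n = len(adj)

    def fitness(individual):
        return modularity(adj, individual)

    # Assumed initialization: every node gets a uniformly random label.
    pop = [[rng.randrange(n) for _ in range(n)] for _ in range(mu)]
    pop.sort(key=fitness, reverse=True)
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            p1, p2 = rng.sample(pop, 2)
            child = one_point_crossover(p1, p2, rng)
            offspring.append(neighbor_mutation(adj, child, rate, rng))
        # Elitist selection: keep the mu best among parents and offspring.
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:mu]
    return pop[0], fitness(pop[0])


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
             3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    best, q = evolve(graph)
    print(best, round(q, 4))
```

Because selection keeps the best parents alongside the offspring, the best modularity in the population never decreases from one generation to the next; a multi-objective variant would replace the scalar sort with a non-dominated (Pareto) ranking.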

7.2.2 Nature inspired

Many metaheuristic optimization approaches based on natural phenomena have been proposed in the literature for community detection, namely Simulated annealing, the Firefly algorithm, the Geography-based algorithm, etc. (Amiri et al. 2013; Zhang et al. 2020).
In this context, Zhang et al. (2020) introduced the Whale Optimization-based Community Discovery Algorithm (WOCDA) as a new community detection approach. WOCDA consists of four main steps that simulate the humpback whale hunting behavior: an initialization technique and three operations, namely shrinking encircling, spiral updating, and random searching. The final goal is to detect an optimal community structure of the network by optimizing the following fitness function:

D_{\lambda} = \sum_{i=1}^{k} \frac{2\lambda L(V_i, V_i) - 2(1 - \lambda) L(V_i, \bar{V}_i)}{|V_i|},

where V_i, i ∈ {1, …, k}, is the set of nodes of G_i, a subnetwork of G which represents the ith community, L(V_i, V_i) is the inner degree of G_i, L(V_i, V̄_i) is the outer degree of G_i, and λ ∈ [0, 1] is a parameter used to explore the network at different resolutions. To begin, a new initialization method based on label diffusion and label propagation is proposed in order to produce a high-quality initial solution. For this, the label of each node v_i in the network is set as l_i = i. Then, a pair of nodes is picked by a binary tournament selection (Eremeev 2018). After that, their labels are assigned to the neighbors using the label diffusion approach. Finally, the algorithm updates all the labels according to the label propagation strategy. This process is defined by the following equation:

l_i = \arg\max_{i} \sum_{j \in \Gamma(i)} \delta(l_i, l_j),

where Γ(i) is the set of neighboring nodes of v_i, and δ(l_i, l_j) = 1 if l_i = l_j, and δ(l_i, l_j) = 0 if l_i ≠ l_j.
Then, using label propagation, a shrinking encircling process is proposed to update the current node's label with the label held by most of its neighboring nodes. This operation is executed |P_se × N| times, such that P_se denotes the probability of executing the shrinking encircling operation and N is the number of nodes. Then, the one-way crossover operator establishes a spiral update operation to maintain excellent communities. Finally, a random searching operation is built to randomly select the label of a neighboring node and update the label of the present node in order to improve global search capabilities.
In the same vein, Yang et al. (2022) proposed a new Bacterial Foraging Optimization method to detect communities in networks. As the bacterial foraging algorithm was originally used for continuous optimization (Das et al. 2009), the authors redefined it into a discrete framework for community detection. In this context, the nutrient function (fitness function) chosen to be optimized is the modularity function (Newman 2006). The evolutionary aspect of the bacterial foraging was developed from a topological perspective. In addition to that, two local updating rules, the greedy strategy and the stochastic strategy, were proposed to adjust the positions of the swarm of bacteria into the favored regions.

7.2.3 Swarm inspired

To optimize an objective function f, let us consider a swarm with M particles. At iteration t, for each particle p_i ∈ {p_1, …, p_M}, a position vector x_i^(t) = (x_{i1}^(t), …, x_{in}^(t))^T and a velocity vector v_i^(t) = (v_{i1}^(t), …, v_{in}^(t))^T are defined. At each iteration t, x_i^(t) and v_i^(t) are updated through each dimension j ∈ {1, …, n}, following these two equations:

v_i^{(t+1)}[j] = v_i^{(t)}[j] + c_1 r_1^{(t)} \left( pbest_i[j] - x_i^{(t)}[j] \right) + c_2 r_2^{(t)} \left( gbest[j] - x_i^{(t)}[j] \right),
x_i^{(t+1)}[j] = x_i^{(t)}[j] + v_i^{(t+1)}[j],

where pbest_i is the ith particle's best position that optimizes the objective function f, gbest is the best position found until the current iteration t, and c_1 and c_2 are random parameters.
In this setting, Rahimi et al. (2018) proposed a new multi-objective community detection approach based on a modified version of Particle Swarm Optimization called MOPSO-Net. The objective criteria to be optimized are kernel k-means (KKM) and ratio cut (RC). The kernel k-means is given by the following equation:

KKM = 2(n - m) - \sum_{i=1}^{m} \frac{L(V_i, V_i)}{|V_i|},

where n is the number of nodes, m is the number of communities, V_i is the set of nodes of the subnetwork G_i from the global graph G, L(V_i, V_i) is the inner degree of G_i, and |V_i| is the cardinality of V_i. On the other hand, the ratio cut (RC) is described by the equation:

RC = \sum_{i=1}^{m} \frac{L(V_i, \bar{V}_i)}{|V_i|},

where L(V_i, V̄_i) is the outer degree of V_i. Finally, to enhance the general PSO approach, MOPSO-Net changed the moving strategy of particles. Thus, to move toward its personal best pbest_i, each particle p_i, i ∈ {1, …, M}, executes a two-point crossover with its personal best. Then, the eventual dominant solution is selected as the temporary position of the particle. However, if no solution dominates the other, an output with the highest Normalized Mutual Information value (NMI)

(Amelio and Pizzuti 2015) is selected as the temporary position. To move toward the global best position gbest, a two-point crossover between its temporary position and the global best is executed. The two resulting outputs are compared to specify the new position of the global best.
In the same scope, Sun et al. (2023) proposed a Core Node Knowledge Based multi-objective Particle Swarm Optimization (CNPSO) for dynamic community detection. First, core nodes are obtained by computing the resistance distance between every pair of nodes v_i, v_j as follows:

r_{ij} = (L^{-1})_{ii} + (L^{-1})_{jj} - 2 (L^{-1})_{ij},

such that L is the Laplacian matrix of the network graph G, and L^{-1} is the generalized inverse of L. Then, nodes associated with these cores construct the constant communities according to the connections to other nodes in the network. The constant communities are considered as prior knowledge from the previous iteration (time stamp) and are used for inserting operations in the current iteration. To ensure the quality of the detected community structure at each iteration and the smoothness of two consecutive time steps, a new updating approach of particle parameters is proposed, which is based on the optimal individual at the previous iteration and the population at the current time step. As in MOPSO-Net (Rahimi et al. 2018), the CNPSO algorithm uses Kernel k-means (KKM) and Ratio Cut (RC) as objective functions.
In this subsection, we discussed some pertinent research works published in the context of community detection using the Metaheuristic based multi-objective optimization framework. Many other substantial contributions in this context can be found in the following research references (as schematized in Fig. 1): (Ebrahimi et al. 2018; Cheng et al. 2018; Zou et al. 2019; Yuanyuan and Xiyu 2018; Li and Liu 2018; Žalik and Žalik 2018; Pattanayak et al. 2019; Shahmoradi et al. 2019; Reihanian et al. 2023; Shen et al. 2022; Shang et al. 2022; Zhu et al. 2018; Moradi and Parsa 2019; Chen and Bi 2019; Zhang et al. 2020; Su et al. 2021; Ma et al. 2021; Wan et al. 2020; Belli et al. 2020; Pourabbasi et al. 2021) for the case of Evolutionary inspired multi-objective optimization based approaches, Koc (2022) for the case of Nature inspired multi-objective optimization based approaches, and Sun et al. (2018), Li et al. (2019), Tripathi et al. (2021), Wang et al. (2022) for the case of Swarm inspired multi-objective optimization based approaches.

7.3 Concluding remarks and discussion

Table 6 summarizes a comparative study of relevant research works relative to community detection approaches based on the Multi-Objective Optimization model.

• The comparative study presented in Table 6 highlights that even though many research works using a single type of interaction (simple networks with one layer) were dedicated to optimal static graph partitioning, further investigation would be necessary to dive into the overlapping community detection problem, multilayer networks, and dynamic networks.
• Another point is that the majority of Multi-objective optimization based approaches are time-consuming. Thus, even though they are competitive or even better than many other frameworks in terms of accuracy, they are not adequate for very large networks, which is a very common issue in the big data era. Consequently, it would be important to take advantage of the parallel aspects of the majority of multi-objective approaches in order to accelerate the computation time.

8 Game model

Game theory is an abstract mathematical model which describes many scenarios in which the decision-making of players is influenced by the decisions of others. Every game involves three main concepts: a set of players, a set of strategies and a utility function that quantifies the payoffs of each player depending on the strategies chosen by the players involved in the game. Game theory can be an efficient modeling tool for many applications, such as multi-objective clustering (Badami et al. 2013), intrusion prevention in wireless networks (Shamshirband et al. 2014), signal processing (Bacci et al. 2015), etc. The game theory model for community detection focuses on three main issues (Jonnalagadda and Kuppusamy 2016):

• The formulation of the payoff function (utility function),
• The game framework used by the agents (nodes): cooperative, non-cooperative,
• The game theoretic solution concepts used to solve the community detection problem: Nash equilibrium in the case of non-cooperative games, or Shapley value and Core value in the case of cooperative games.

In the context of community detection in complex networks, two main frameworks are generally used: Non-cooperative game and Cooperative game. Both frameworks are discussed in the following subsections. Furthermore, some relevant research works are described.
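
The ingredients of the game model listed above (players, strategies, a utility function, and an equilibrium as the solution concept) can be made concrete with a toy example. In the Python sketch below, each node is a player, its strategy is the label of the community it joins, and the game is played by best-response moves until no player can improve alone, i.e. until a Nash equilibrium is reached. The utility used here (one unit per neighbor sharing the label, minus a cost `gamma` per co-member) is a hypothetical choice made only for illustration and is not taken from any of the cited works; the function names and the default `gamma = 0.3` are likewise assumptions.

```python
def utility(adj, labels, i, label, gamma=0.3):
    """Hypothetical payoff of node i if it joins community `label`.

    Each co-member j contributes 1 if j is a neighbor of i, minus a fixed
    cost gamma for sharing a community with j.
    """
    return sum((1.0 if j in adj[i] else 0.0) - gamma
               for j in adj if j != i and labels[j] == label)


def strategies(labels):
    """Strategy space of a player: any existing label or a fresh, empty one."""
    existing = set(labels.values())
    return sorted(existing | {max(existing) + 1})


def best_response_dynamics(adj, gamma=0.3):
    """Let players deviate one at a time until nobody gains by moving alone."""
    labels = {v: v for v in adj}  # every node starts in its own community
    improved = True
    while improved:
        improved = False
        for i in sorted(adj):
            current = utility(adj, labels, i, labels[i], gamma)
            # Best reply; ties are broken toward the smallest label.
            best = max(strategies(labels),
                       key=lambda c: (utility(adj, labels, i, c, gamma), -c))
            if utility(adj, labels, i, best, gamma) > current + 1e-12:
                labels[i] = best
                improved = True
    return labels


def is_nash_equilibrium(adj, labels, gamma=0.3):
    """True if no single player can strictly increase its utility alone."""
    for i in adj:
        current = utility(adj, labels, i, labels[i], gamma)
        if any(utility(adj, labels, i, c, gamma) > current + 1e-12
               for c in strategies(labels)):
            return False
    return True


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
             3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    communities = best_response_dynamics(graph)
    print(communities, is_nash_equilibrium(graph, communities))
```

With this symmetric pairwise utility, every strictly improving move also increases the sum of (A_ij − gamma) over pairs of nodes sharing a community, so the dynamics cannot cycle and must stop at an equilibrium; other utilities do not necessarily share this property.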

8.1 Non‑cooperative game

Community formation in complex networks can be modeled as a Non-cooperative game Γ = (N, (S_i)_{i∈N}, (U_i)_{i∈N}) such that: N = {1, 2, …, n} is the set of selfish players, representing the set of the network's nodes. Each player i ∈ N has a strategy space S_i constituted of the different strategies that it can play, where each played strategy defines its possible belonging community(ies). U_i: S = Π_{i=1}^{n} S_i → ℝ is the utility payoff function of the player i ∈ N, which computes the expected income that the player i gets when a strategy profile s ∈ Π_{i=1}^{n} S_i is played by all the players (each player has played one of its strategies).
The game is played among the nodes until an equilibrium is reached; generally the Nash equilibrium is considered. The notion of Nash equilibrium, proposed by Nash in 1950 (Nash Jr 1950), is a situation such that no player has an interest in deviating alone from the situation obtained. Formally, s* = (s*_1, …, s*_n) ∈ S is a Nash equilibrium if ∀i ∈ N, ∀s'_i ∈ S_i: U_i(s*_i, s*_{−i}) ≥ U_i(s'_i, s*_{−i}), where s*_{−i} is the strategy profile played by the players other than i, i.e. s*_{−i} = (s*_1, …, s*_{i−1}, s*_{i+1}, …, s*_n). The strategy profile s* corresponding to this situation is a community structure solution of the given network.
In the literature, there are mainly four types of approaches based on Non-cooperative games for community detection in complex networks: Potential games, Graphical games, Iterative games, and Strategic games.

8.1.1 Potential games

In this framework, for each player i ∈ N, a gain function φ_i: S × S_i → ℝ is defined to quantify the gain/loss of the utility of the player i induced by changing its strategy alone. Thus, a single potential function Φ is defined to quantify the incentive of every player to change its strategy alone (its belonging community(ies)). Within this scope, given a strategy profile s ∈ S, for every player i ∈ N and an alternative strategy s'_i ∈ S_i:

\phi_i(s, s'_i) = \Phi(s) - \Phi(s_{-i}, s'_i) = u_i(s_{-i}, s'_i) - u_i(s).

In other words, if a player i ∈ N switches from its initial strategy s_i to another strategy s'_i ∈ S_i in order to increase its payoff, then the value of the potential function Φ decreases by the same amount (Monderer and Shapley 1996).
In this setting, Chen et al. (2010) introduced a Potential game-theoretic framework to resolve the overlapping community detection issue in social networks. They modeled the dynamics of community construction as a potential game: each node in the network is considered as a selfish player who selects which communities to join or leave according to a utility measurement. A community structure consists of an equilibrium of this game. They computed the players' utility by the combination of a gain function and a loss function. The gain function is based on the personalized modularity concept, and the loss function reflects the intrinsic costs incurred when players change strategies by joining some communities. Thus, given a strategy profile s = (s_1, …, s_n) ∈ S, the personalized modularity function of a player i ∈ N is given by the following equation:

Q_i(s) = \frac{1}{2m} \sum_{j \in N} \left( A_{ij} \hat{\delta}(i, j) - \frac{d_i d_j}{2m} |S_i \cap S_j| \right),

where m is the number of edges in the network, S_i is the strategy of the player i, δ̂(i, j) = 1 if |S_i ∩ S_j| ≥ 1 and δ̂(i, j) = 0 otherwise.

8.1.2 Graphical games

A Graphical game (N, (S_i)_{i∈N}, (U_i)_{i∈N}) is presented as a graph G = (V, E), such that V = N and (i, j) ∈ E if the utility function of the player i depends on the strategy chosen by the player j and inversely. Thus, ∀i ∈ N, its utility function is u_i: Π_{j∈(N(i)∪{i})} S_j → ℝ such that N(i) is the set of neighbors of i in the graph G. This model is appropriate for representing games in which players directly influence the outcomes of only a subset of players in the game.
Based on this framework, Narayanam and Narahari proposed a novel graphical game theory inspired approach to detect communities in social networks (Narayanam and Narahari 2012) in a decentralized way. First, they set up a community detection game with a utility function for each node that is fully dependent on its neighborhood. In other words, the utility function for each node is entirely reliant on local knowledge about that node's egocentric network. This utility function for a node i is described by the following equation:

u_i(S) = d_i(S) + \frac{T_i(S)}{C^{2}_{d_i(S)}} f(d_i(S)),

where d_i(S) is the number of neighbors of the node i in the community S, T_i(S) is the number of pairs of neighbors of the node i in S that are themselves connected, f(.) is a linear weight function, and C^2_{d_i(S)} is the number of 2-combinations from d_i(S). The authors demonstrated that the suggested utility function ensures the presence of a Nash equilibrium. Then, they proposed a decentralized algorithm based on this approach for detecting clusters in social networks, called Nash Stability based Community Detection (NASHCoDe). First, an initial partition of the graph in which nodes are clustered into groups of three nodes is defined. After that, nodes are ordered in a non-decreasing sequence of their degree. Then,
Table 6  Comparative table of characteristics of complex network communities based on Multi-objective optimization model

| Model | Framework | Approach | Works | Community type | Links | Distribution | Complexity | Scalability | Application |
|---|---|---|---|---|---|---|---|---|---|
| Multi-objective | Hierarchical | Divisive | Girvan and Newman (2002) | Partitioning | Undirected | Centralized | $O(m^2 n)$ | Not considered | Social circle detection |
| | | Agglomerative | Lalwani et al. (2015) | Partitioning | Undirected | Distributed | $O(n \log(n))$ | Considered | Recommender system |
| | | | Qie et al. (2022) | Partitioning | Undirected | Distributed | $O(t(2n + m))$, $t$: number of iterations | Considered | Community detection in complex networks |
| | | | Zhao et al. (2023) | Partitioning | Undirected | Centralized | – | Not considered | Dynamic community detection derived from the daily rhythms of human mobility |
| | Metaheuristic | Evolutionary | Malhotra (2021) | Partitioning | Undirected | Parallelizable | $O(P_s \cdot k \cdot (n + e) + n \cdot d)$, $P_s$: population size, C: number of communities | Considered | Community detection in complex networks |
| | | | Bello-Orgaz et al. (2018) | Overlapping | Undirected | Parallelizable | $O(n)$ | Considered | Community detection in complex networks |
| | | | Reihanian et al. (2023) | Overlapping | Undirected | Parallelizable | – | Considered | Social communities in social networks with node attributes |
| | | | Shen et al. (2022) | Partitioning | Undirected | Distributed | – | Considered | Community detection in complex networks |
| | | | Shang et al. (2022) | Overlapping | Undirected | Parallelizable | $O(n^2 \times PS_{no} \times PS_{no} \times t)$, $PS_{no}$: population size in non-overlapping partitions, $PS_o$: population size in overlapping partitions, $t$: number of iterations | Considered | Community detection in complex networks |
| | | Nature inspired | Zhang et al. (2020) | Partitioning | Undirected | Parallelizable | $O(n \times dim)$, $dim$: nodes' dimension | Considered | Community detection in complex networks |
| | | | Yang et al. (2022) | Partitioning | Undirected | Parallelizable | Three parts: chemotaxis computation: $O(N_c S n + N_c S N_s m)$; reproduction computation: $O(N_c S m + N_c S n)$; elimination-dispersal computation: $O(N_c N_{ned} n)$; $N_c$: number of chemotactic steps, $N_s$: number of bacteria, $N_{ned}$: number of elimination-dispersal steps | Considered | Community detection in complex networks |
| | | | Koc (2022) | Partitioning | Undirected | Parallelizable | – | Considered | Community detection in complex networks |
| | | Swarm inspired | Rahimi et al. (2018) | Partitioning | Undirected | Parallelizable | – | Considered | Social circle detection |
| | | | Tripathi et al. (2021) | Partitioning | Undirected | Parallelizable | – | Considered | Community detection in complex networks |
| | | | Sun et al. (2023) | Partitioning | Undirected | Parallelizable | O(maxgen × t × (m + n) × n, 2), t: maximum number of iterations | Considered | Community detection in dynamic complex networks |
| | | | Wang et al. (2022) | Partitioning | Undirected | Parallelizable | $O(m + n)$ | Considered | Community detection in complex networks |

n: number of nodes in the network. D: average degree of nodes. K: number of clusters. m: number of edges, dim: dimension

the nodes are browsed iteratively, and if a node can improve its utility by moving from its group to a neighboring one, the node is moved. The algorithm stops when no node can improve its utility by changing its strategy.

8.1.3 Iterative games

To detect overlapping communities in social networks, Alvari et al. (2011) proposed two frameworks named PSGAME and NGGAME, based on Iterative games and structural equivalence, to model the formation of overlapping communities. Each node of the network is considered as an agent trying to form coalitions with members whose structures are equivalent to its own. The PSGAME approach incorporates the Pearson correlation into the game-theoretic framework, whereas NGGAME makes use of neighborhood and adjacency relationships as a similarity measure in the game framework.

8.1.4 Strategic games

Hajibagheri et al. (2012) proposed a new approach called Genetic Algorithm Diffusion Model (GADM) for detecting communities in networks. This approach is based on an Information diffusion model and a Potential game. Thus, each node of the graph is considered as a rational agent trying to maximize its Shapley value based on the information it receives. The Nash equilibrium of the game reveals the community structure of the graph.

8.2 Cooperative game

Given an interconnection graph $G(V, E)$, the emergence of a community structure can be modeled as a Cooperative game (Coalitional game) $\Gamma = (N, v)$ in which $N$ is a set of players, known as the grand coalition, that represents the nodes of the graph, and $v$ is a characteristic function from the set of all possible coalitions of players $2^{|N|}$ to their payoffs. Depending on the nature of the function $v$, two types of games are defined: cooperative games with transferable utility (TU) and cooperative games with non-transferable utility (NTU). The game played is a TU game if $v$ is a real-valued function, i.e. one player can transfer a part of its payoff to another player from the same coalition. In an NTU game, the function $v$ assigns to each coalition a feasible set of payoff vectors. The game continues until no player changes coalitions, and the reached state reveals the community structure to adopt.
In the literature, we distinguish three main classes of approaches that are based on Cooperative games for community detection in complex networks: Super additive games, Partition function games, and Influence games.

8.2.1 Super additive games

A game $\Gamma = (N, v)$ is Super additive if for every pair of disjoint coalitions $S, T \subset N$: $v(S \cup T) \geq v(S) + v(T)$. In this context, Zhou et al. (2013) proposed an approach that incorporates network topological structure information and individuals' attribute information through Super additive cooperative games. They introduced the Super additive game to support strategic decision-making for recognizing communities in dynamic and heterogeneous social networks. In this cooperative game based approach, the Shapley value is used to assess users' preferences and contributions to a certain coalition, as well as the coalition's relationship tightness. Then, they provided an iterative algorithm for computing the Shapley value in order to increase computation efficiency.

8.2.2 Partition function games

In Partition function games, the value of a coalition $S$ does not depend only on its members; it also depends strongly on the way in which the players of the set $N - S$ (the players not belonging to $S$) are structured, i.e. it depends on the partition of $N$ that is in place at any time of the game.
Within this scope, Ayachi et al. (2021a), Ayachi et al. (2021b) studied the problem of Horizontal Cloud Federation Formation (HCFF) in the case where a set of manager clouds simultaneously initiate the formation of several non-overlapping horizontal cloud federations with a set of subordinate clouds. They modeled the situation as a Partition Function Game (PFG) without transferable utility $(C, V)$, where $C = \{C_1, \dots, C_n\}$ is the cloud set and $V : (S, \Pi) \to V(S, \Pi)$ is a function which assigns a value to each coalition $S$, such that $S \in \Pi$ and $\Pi$ is the coalition structure of $C$. Here $V(S, \Pi) = (\phi_1(S, \Pi), \dots, \phi_{|S|}(S, \Pi))$, and $\phi_i(S, \Pi)$ is the gain of the $i$th individual of the federation.
To compute the gain function $\phi_i(S, \Pi)$ of the $i$th individual, let us consider the production costs $p_{C_i}(S)$ of virtual machines provided by $C_i$ to $S$, the cost $v_i(S)$ paid by $C_i$ to satisfy the client request, and $rev_i(S)$, the revenue of $C_i$ obtained from the federation $S$. Then:

$$\phi_i(S, \Pi) = \begin{cases} rev_i(S) - v_i(S) & \text{if } i \text{ is a manager cloud}, \\ v_i(S) - p_{C_i}(S) & \text{otherwise}. \end{cases}$$

8.2.3 Influence games

Let $G = (V, E)$ be a graph. The Influence game of $G$ is a couple $(G, v_G)$, where $v_G$ is a characteristic function that assigns to every coalition $S \subset V$ a number $v_G : 2^N \to \mathbb{R}$, such that

$v_G(S)$ is the sphere of influence of nodes of $S$ (Szczepański et al. 2015).
In their framework called Cooperative Game Theory-Based Algorithm for Overlapping Community Detection (CGTA), Zhou et al. (2020) considered the process of community detection as an influence game model, in which each node acts as a player and communities act as game coalitions. In addition, based on cooperative game theory, an overlapping community detection method is provided. Furthermore, an edge weight calculation for computing the Shapley value of nodes and coalitions is proposed, and an optimal income strategy for every player is described. Coalitions may be merged to prevent small communities by selecting coalitions that can increase the value of their characteristic function. This approach does not require any prior knowledge about the number or size of communities, and the overlapping community structure is stable at the end of the algorithm.

8.3 Concluding remarks and discussion

Table 7 is a comparative study of some research works about community detection approaches based on the Game theory model.
The comparative study shown in Table 7 reveals the following points:

• In the literature, most research works about community detection using game theoretic methods are limited to small-scale networks (mainly social graphs). However, the majority of real complex information networks are very large, and many available game theoretic algorithms have high time complexity and are not good solutions for community detection in real-world graph databases in big data environments. Hence, it would be interesting to study how, for instance, parallelism and distributed computing can improve performance in this setting.
• As in the case of the graph model, the majority of research works about community detection based on the game theoretic model are limited to the undirected network case. However, treating a directed network as undirected induces a loss of relational information in the graph, which could lead to node classification errors. Thus, it would be interesting if hybrid game theoretic models could be developed to detect both edge density and feature similarity communities in the directed network case.

9 Classification proposal of community applications in complex information networks

As explained in the previous sections, up to now, there is no unique formal consensual definition of a community in real complex information networks. This is mainly due to its wide range of applications. Indeed, community detection is used in many information network applications, such as depicting common features among nodes and optimizing time and memory complexities in big data environments. In this section, we review the five common community applications in complex information networks found in the literature: Collaborative filtering, Friend recommendation, Sybil defense, Web community, and Service composition communities.
As a result, Table 8 and Fig. 4 depict the distribution of the most relevant works, published mostly in the last decade, about community detection in complex information networks.

9.1 Collaborative filtering

In collaborative filtering, we consider a set of $n$ users $U = \{u_1, u_2, \dots, u_n\}$, a set of $m$ items $I = \{i_1, \dots, i_m\}$, and for each user $u_i \in U$, a subset $I_{u_i} \subset I$ of items which the user $u_i$ has already rated. Given an active user $u_a \in U$, the aim of collaborative filtering is to find the items with the highest utility $\arg\max_{i_k \in I} R(u_a, i_k)$ to the user, where $R(u_a, i_k)$ is the rating given by the user $u_a$ to the item $i_k$. Let $G(V, E)$ be a graph representing the trust network between users, where $V = U$, and $\forall u_1, u_2 \in V$, $(u_1, u_2) \in E$ if $u_1$ trusts $u_2$. For every active user $u_a \in V$, we denote by $TC(u_a)$, a subgraph of $G(V, E)$, the trust community of $u_a$. Based on the fact that users tend to rely on their friends' recommendations, Collaborative filtering approaches examine the ratings of the users belonging to the social/trust communities $TC(u_a) \subset U$ of the targeted user $u_a$ in order to predict $u_a$'s preferences over the given set of items $I$, and then generate high quality recommendations.
Within this scope, Lee and Ma (2016) presented a hybrid approach of collaborative filtering which is based on both user preferences and trust information. User preferences consist of co-rating similarities between users, which are computed using the Jaccard coefficient and its variants (Al-Oufi et al. 2012). On the other hand, trust information means direct or indirectly inferred trust relationships between users. Moreover, the approach relies on both trust and distrust links and studies their propagation effects on recommendation accuracy.
For further investigation on community detection applied to Collaborative filtering applications, Table 9 classifies relevant research works published since 2010. This classification is established according to the models and frameworks

Table 7  Comparative table of characteristics of complex network communities based on Game theoretic model

| Model | Framework | Approach | Works | Community type | Links | Distribution | Complexity | Scalability | Application |
|---|---|---|---|---|---|---|---|---|---|
| Game | Non-cooperative | Potential | Chen et al. (2010) | Overlapping | Undirected | Parallelizable | $O(S)$, $S$: number of all possible strategies | Considered | Social circle detection |
| | | Iterative | Alvari et al. (2011) | Overlapping | Undirected | Parallelizable | $O(S)$, $S$: number of all possible strategies | Considered | Social circle detection |
| | | Strategic | Hajibagheri et al. (2012) | Partitioning | Undirected | Parallelizable | $O(tn + Cnd)$, $t$: number of runs, $C$: number of player selections | Considered | Social circle detection |
| | Cooperative | Super additive | Zhou et al. (2013) | Overlapping | Undirected | Parallelizable | $O(mn^4)$ | Considered | Social circle detection |
| | | Partition function | Ayachi et al. (2021a, 2021b) | Partitioning | Undirected | Distributed | – | Considered | Coalition of virtual machines of federation and subordinate clouds |
| | | Influence | Szczepański et al. (2015) | Overlapping | Undirected | Parallelizable | $O(n^2 + nm)$ | Considered | Social circle detection; Friend recommender |
| | | | Zhou et al. (2020) | Overlapping | Undirected | Parallelizable | $O(n)$ | Considered | Social circle detection; Friend recommender |

n: number of nodes in the network. D: average degree of nodes. K: number of clusters. m: number of edges, dim: dimension
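
Several of the cooperative approaches compared in Table 7 rely on the Shapley value to quantify the contribution of a node to a coalition. The following Python sketch computes exact Shapley values by averaging marginal contributions over all orderings of the players, which is only feasible for a handful of players. The characteristic function used here (the number of edges inside a coalition) and the three-node path graph are illustrative assumptions, not the functions used in the cited works.

```python
from itertools import permutations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values: average marginal contribution over all orderings."""
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the players before it.
            phi[p] += v(with_p) - v(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in phi.items()}

# Hypothetical toy game on the path a - b - c:
# the value of a coalition is the number of edges it contains.
EDGES = {frozenset(e) for e in [("a", "b"), ("b", "c")]}

def edges_inside(coalition):
    return sum(1 for e in EDGES if e <= coalition)

values = shapley_values(["a", "b", "c"], edges_inside)
```

In this toy game the central node b receives the largest value, and the values sum to the worth of the grand coalition. The enumeration grows factorially with the number of players, which is one reason why approximate or iterative computations are needed on real networks.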

that define our proposed taxonomy described in Fig. 1 of Sect. 4. Note that all the empty frameworks and models have been left out of the table.
What stands out in Table 9 is that the Game model of communities is scarcely used in the context of Collaborative filtering applications. Furthermore, it shows the predominance of Unsupervised learning approaches over the other frameworks. Also, the table reveals that Multi-objective optimization models of communities are more frequently used in Collaborative filtering than the Graph model.

9.2 Friend recommendation

Let $G(V, E)$ be the graph describing an online social network, where $V$ is the vertex set that represents the users and $E \subset V^2$ is the set of edges that denote social relationships (trust, friendship) between users. For $v_i, v_j \in V$, $(v_i, v_j) \in E$ means that $v_i$ and $v_j$ are friends. Complex online social networks are generally sparse structures, which means that an online social network usually represents a small part of a user's potential online social circles. The friend recommendation problem can be formalized as a probabilistic model. Indeed, given an active user $v_a \in V$ and a set $\bar{N}(v_a)$ of non-adjacent users of $v_a$, i.e. $\forall v_j \in \bar{N}(v_a) : (v_a, v_j) \notin E$, the aim is to evaluate the probability that a friendship between a user $v_j \in \bar{N}(v_a)$ and $v_a$ will be formed, and eventually to recommend $v_j$ as a potential friend to $v_a$. Based on user $v_j$'s reputation in the social circle of $v_a$ and the relations in the considered social network, detection of the dense social communities of $v_a$ would help to assess the probability of a future friendship relation even though there has not been any prior interaction between them.
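
The formulation above can be sketched as a simple scoring routine. The snippet below is a minimal illustration, assuming that the score of a candidate $v_j \in \bar{N}(v_a)$ is the fraction of $v_a$'s friends who are also friends of $v_j$. This common-neighbour ratio is a stand-in chosen for brevity, not the probability model of any cited work, and the function name `recommend_friends` and the toy graph are hypothetical.

```python
def recommend_friends(adj, va, top_k=3):
    """Rank non-adjacent users of `va` by the share of va's friends they know."""
    friends = adj[va]
    if not friends:
        return []
    # Candidate set: every user that is neither va nor already a friend of va.
    candidates = set(adj) - friends - {va}
    scores = {vj: len(friends & adj[vj]) / len(friends) for vj in candidates}
    # Sort by decreasing score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(vj, s) for vj, s in ranked if s > 0][:top_k]

# Hypothetical undirected friendship graph stored as adjacency sets.
adj = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"bob", "cat", "eve"},
    "eve": {"dan"},
}
suggestions = recommend_friends(adj, "ann")
```

Here only `dan`, who knows both of `ann`'s friends, is suggested, while `eve` shares no friend with `ann` and is filtered out. A community-aware variant could restrict the candidates to members of the communities detected around $v_a$.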

In this context, Akbari et al. (2013) proposed a friend recommendation system based on Artificial Bee Colony (ABC) which recommends new friends among users within the social network. The approach relies on topological information about communities in social networks to propose relevant features for relationships in the network. This topological information consists of: the egocentric social circle of the targeted user, the number of shared friends, the density of links between the set formed by all the neighbors of the targeted user and the candidate user, and finally, the number of common friendship circles to which both the targeted user and the candidate user belong.
For further reading on community detection applied to Friend recommendation, Table 10 gives a summarizing classification of pertinent research works published since 2010 according to the models and frameworks described in our proposed taxonomy (see Fig. 1 of Sect. 4). All the frameworks and models with no published works have been left out of the table.
It is apparent from Table 10 that few frameworks of community detection have been used in the context of the Friend recommendation application. Indeed, the Game theory model has not been used. Also, the majority of works applying communities to Friend recommendation are based on the Unsupervised learning framework. However, fewer works have used the Stochastic framework of the Graph model and the Meta-heuristic framework of the Multi-objective optimization model.

9.3 Sybil defense

Let $G(V, E)$ be a graph representing a complex network, where $V$ is the vertex set that denotes the different interacting entities and $E \subseteq V^2$ is the edge set that represents any peer-to-peer relationship (friendship, trust, etc.) between nodes. Let $A$ be the $(n \times n)$ adjacency matrix of $G(V, E)$, where $n = |V|$. We consider $C \subset V$ as a set of malicious nodes that emulate the behaviors of real nodes in order to form an adversarial subnetwork. Generally, Sybil nodes tend to form edges between each other in order to blend into the network and gain influence in it. On the other hand, an edge between a true node and a Sybil node is generally harder to form, as the trust value between them is generally low. Thus, the search for the Sybil node set $C \subset V$ consists in finding a bottleneck cut $\Phi$ in the graph, which is a cut $(C, V - C)$ that has a minimum density

$$\arg\min_{C} \frac{\sum_{v_i \in C,\, v_j \notin C} A_{ij}}{|C|\,|V - C|}.$$

This problem can be solved by different graph-based approaches for community detection.
Mulamba et al. (2016) proposed SybilRadar, a Sybil attack detection framework based on structural properties of online social networks and community structures. SybilRadar consists of three steps. In the first step, similarity values between a given pair of nodes are computed using the Adamic-Adar metric (Adamic and Adar 2003), which is based on common neighbors between any given pair of nodes. Indeed, true nodes generally tend to share more friends with other true nodes than with Sybil nodes. In the second step, the Louvain method (Blondel et al. 2008) is applied to detect social graph communities. Then, the resulting community structure is fed to the Within-Inter-Community (WIC) metric in order to refine the similarities obtained from the first step. This step ends by tuning the node similarity values which are greater than 1. Next, the resulting similarity values are assigned to the social graph edges as their weights. Finally, in the third step, a Modified Short Random Walk is applied on the weighted social graph; the steady state probabilities of the random walk represent the posterior trust information used to rank the nodes.
For further investigation on community detection applied to Sybil defense, Table 11 gives a classification of pertinent research works published since 2010 according to the used models and frameworks of communities.
From Table 11, we can see that the majority of research works using community detection for Sybil defense fall under the Stochastic framework of the Graph model, whereas fewer works use either the Unsupervised and Supervised frameworks of the Machine learning model, or the Hierarchical and Meta-heuristic frameworks of the Multi-objective model. However, publications on Sybil defense based on the other community detection frameworks (namely Pattern detection and Spectral clustering of the Graph model, Semi-supervised learning of the Machine learning model, and Cooperative and Non-cooperative games of the Game theory model) remain very rare.

9.4 Web communities

Let $G(V, E)$ be a directed graph representing a subnetwork of the World Wide Web (WWW) network, where $V$ is the vertex set which represents the set of web pages in the network, and $E \subset V^2$ is the edge set where $\forall v_i, v_j \in V$, if $(v_i, v_j) \in E$ then the page $v_i$ cites (refers to) the page $v_j$. These citations can be tags, hyperlinks, etc. The notion of Community can be used in the context of the WWW to enhance Web searching results. Indeed, structuring search query results into clusters (communities) can be a suitable way to enhance page retrieval accuracy on the web. Let $WP = \{WP_1, \dots, WP_k\}$, $WP_i \subseteq V$, $\forall i \in \{1, \dots, k\}$ be a community structure of $G$, where a community can be defined either by the topological structure of the WWW network or by the semantic similarity of pages' contents.
In the case where the topological structure of the network is considered, the approaches rely on the principle that relevant and related web pages are generally located close to each other in the hyperlink graph. Thus, their aim consists in depicting highly cohesive structures using community
Table 8  Classification of research works relative to the applications of communities in complex information networks

| Community application in complex information networks | Research works |
|---|---|
| Collaborative filtering | Alqadah et al. (2015), Azadjalal et al. (2017), Bellogin and Parapar (2012), Cai et al. (2014), Cañamares and Castells (2017), Casino et al. (2015), Chen et al. (2015), Fletcher and Liu (2015), Hernando et al. (2016), Hu et al. (2014), Jia et al. (2015), Kant and Mahara (2018), Saranya and Sadasivam (2017), Koohi and Kiani (2016), Koohi and Kiani (2017), Lee and Ma (2016), Li et al. (2013), Li et al. (2017), Patra et al. (2015), Pham et al. (2011), Pirasteh et al. (2015), Polatidis and Georgiadis (2017), Wu et al. (2013), Xiaojun (2017), Yu and Huang (2016), Zheng et al. (2011), Zheng et al. (2015), Jiang et al. (2022), Paleti et al. (2021), Chen et al. (2021) |
| Friend recommendation | Akbari et al. (2013), Bagci and Karagoz (2016), Chakrabarty et al. (2019), Cui et al. (2018), Samanthula and Jiang (2015), Wang et al. (2015), Wang et al. (2018), Xu and Yang (2015), Zhang and Li (2011), Zhang et al. (2015), Zheng et al. (2015), Chang et al. (2022) |
| Sybil defense | Ahmed and Abulaish (2013), Boshmaf et al. (2013), Boshmaf et al. (2016), Cai and Jermaine (2012), Cao et al. (2012), Chang et al. (2013), Danezis and Mittal (2009), Huang et al. (2013), Jia et al. (2017), Liu et al. (2014), Ma et al. (2014), Misra et al. (2016), Mohaisen et al. (2011), Mulamba et al. (2016), Nilizadeh et al. (2017), Ramalingam et al. (2017), Shi et al. (2013), Stringhini et al. (2015), Tan et al. (2013), Wei et al. (2012), Xiao et al. (2015), Xue et al. (2013), Yang et al. (2016), Yu et al. (2008) |
| Web community | Abualigah et al. (2016), Bandari et al. (2019), Bharti and Raval (2019), Cobos et al. (2014), Di Marco and Navigli (2013), Forsati et al. (2015), He et al. (2014), Kanavos et al. (2019), Katarya and Verma (2017), Sisodia et al. (2017), Tiwari et al. (2019), Tseng et al. (2014), Yan et al. (2017) |
| Service composition | Cherifi et al. (2013), Chhun et al. (2015), Khanouche et al. (2019), Klein et al. (2012), Lei and Philip (2019), Li et al. (2017), Li and He (2014), Pan and Chai (2018), Shang et al. (2013), Wen et al. (2019), Zhang et al. (2016), Jalal et al. (2023), Chang et al. (2021), Smahi et al. (2021), Nacer et al. (2017) |

Fig. 4  Distribution of literature relative to applications of community detection in complex information networks

detection approaches. Consequently, a clustering solution $WP = \{WP_1, \dots, WP_k\}$ s.t. $WP_i \subset V$ is generally obtained by optimizing some objective functions, $\max f_1(WP_i, WP_i)$ and $\min f_2(WP_i, \overline{WP_i})$, with $\overline{WP_i} = WP - WP_i$, that quantify the link densities between communities.
On the other hand, in the case where semantic similarities are considered, the approaches used are knowledge-oriented: they consist in extracting hidden patterns and features and using clustering approaches in order to extract query-relevant information. In this context, let us consider $WP = \{WP_1, \dots, WP_k\}$ s.t. $WP_i \subset V$ as a clustering solution on the set of pages $V = \{v_1, \dots, v_n\}$ such that every node $v_i$, $\forall i \in \{1, \dots, n\}$, is considered as an $n$-vector of web page features ($v_i \in \mathbb{R}^n$). In this case, the aim is to optimize objective functions that quantify similarities between pages in order to maximize intra-cluster similarities and minimize inter-cluster similarities.
To enhance the capabilities of keyword-based search engines, Tseng et al. (2014) proposed an algorithm based on a variation of the k-means algorithm called Keen-means in order to cluster Web resources in advance. On the other hand, Di Marco and Navigli (2013) presented an approach for Web search result clustering which handles the issue of language ambiguity. The objective is to enhance the relevance of the list of results retrieved for a query by using a clustering algorithm. Thus, the set of search results is preprocessed. Then, using Word Sense Induction (WSI), which automatically discovers the meanings of any given query, the semantics are injected into the query results. Each query sense is then represented as a cluster of words that co-occur in raw text with the query. Finally, each search result returned by a Web search is mapped to the most pertinent cluster and the resulting clustering of results is returned.
For further reading on the subject of Web communities, Table 12 classifies additional relevant references published in this scope since 2010. The classification is established according to the models and frameworks constituting our taxonomy described in Fig. 1 of Sect. 4. Note that the table does not contain frameworks and models with no published work.
It can be seen from Table 12 that the majority of publications about Web communities use the Unsupervised machine learning framework. On the other hand, relatively fewer research works are based on the Multi-objective model (Hierarchical and Meta-heuristic frameworks). In addition to that, it stands out in the table that the Graph model is scarcely used in

Table 9  Classification of research works relative to community detection in the context of collaborative filtering according to the models and frameworks of the used communities

| Models | Frameworks | Research works |
|---|---|---|
| Graph model | Patterns detection | Lee and Ma (2016) |
| | Spectral clustering | Bellogin and Parapar (2012) |
| Machine learning model | Unsupervised learning | Alqadah et al. (2015), Casino et al. (2015), Hernando et al. (2016), Kant and Mahara (2018), Koohi and Kiani (2016), Li et al. (2017), Xiaojun (2017), Jiang et al. (2022) |
| | Supervised learning | Cañamares and Castells (2017), Patra et al. (2015) |
| Multi-objective optimization model | Hierarchical | Hu et al. (2014), Pham et al. (2011), Paleti et al. (2021) |
| | Metaheuristic | Chen et al. (2021), Chen et al. (2015) |
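
As a rough illustration of the trust-community formulation of Sect. 9.1, the sketch below predicts $R(u_a, i_k)$ as the mean rating given to $i_k$ by the members of $TC(u_a)$ and returns the unrated item with the highest prediction. The plain average, the function name `recommend_item` and the toy data are assumptions made for this example; the surveyed approaches use more elaborate similarity and trust-propagation schemes.

```python
def recommend_item(ratings, trust_community, ua):
    """Return (item, predicted rating) maximising the mean rating within TC(ua)."""
    already_rated = set(ratings.get(ua, {}))
    totals, counts = {}, {}
    for member in trust_community:
        if member == ua:
            continue
        for item, r in ratings.get(member, {}).items():
            if item in already_rated:
                continue  # only items that ua has not rated yet are candidates
            totals[item] = totals.get(item, 0.0) + r
            counts[item] = counts.get(item, 0) + 1
    if not totals:
        return None
    predicted = {item: totals[item] / counts[item] for item in totals}
    # Iterate over sorted item names so that ties are broken deterministically.
    best = max(sorted(predicted), key=predicted.get)
    return best, predicted[best]

# Hypothetical ratings; u4 is outside the trust community and is ignored.
ratings = {
    "ua": {"i1": 5},
    "u2": {"i1": 4, "i2": 3, "i3": 5},
    "u3": {"i2": 4, "i3": 4},
    "u4": {"i2": 1},
}
result = recommend_item(ratings, trust_community={"u2", "u3"}, ua="ua")
```

Restricting the average to the trust community is what distinguishes this sketch from a global item average: ratings of users outside $TC(u_a)$ do not influence the prediction.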

Table 10  Classification of research works relative to community detection in the context of friend recommendation according to the models and frameworks of the used communities

| Models | Frameworks | Research works |
|---|---|---|
| Graph model | Stochastic | Bagci and Karagoz (2016), Cui et al. (2018) |
| Machine learning | Unsupervised | Wang et al. (2015), Wang et al. (2018), Xu and Yang (2015), Zheng et al. (2015), Chang et al. (2022) |
| Multi-objective optimization | Metaheuristic | Akbari et al. (2013) |

Table 11  Classification of research works relative to community detection in the context of Sybil defense according to the models and frameworks of the used communities

| Models | Frameworks | Research works |
|---|---|---|
| Graph model | Stochastic | Danezis and Mittal (2009), Boshmaf et al. (2013), Boshmaf et al. (2016), Cai and Jermaine (2012), Cao et al. (2012), Chang et al. (2013), Huang et al. (2013), Jia et al. (2017), Mohaisen et al. (2011), Mulamba et al. (2016), Wei et al. (2012), Xue et al. (2013), Yu et al. (2008), Shi et al. (2013) |
| Machine learning | Unsupervised | Nilizadeh et al. (2017), Tan et al. (2013) |
| | Supervised | Ramalingam et al. (2017), Xiao et al. (2015) |
| Multi-objective optimization | Hierarchical | Misra et al. (2016) |
| | Metaheuristic | Ahmed and Abulaish (2013), Ahmed and Abulaish (2013) |
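
The bottleneck-cut criterion of Sect. 9.3 can be evaluated directly for a candidate set $C$. The following sketch computes the cut density $\sum_{v_i \in C, v_j \notin C} A_{ij} / (|C|\,|V - C|)$ and, for a tiny graph only, finds a minimising set by exhaustive search. The brute-force search is an illustrative assumption and is exponential in $|V|$; the surveyed methods rely on community detection heuristics instead. The six-node graph is hypothetical.

```python
from itertools import combinations

def cut_density(A, C):
    """Density of the cut (C, V - C): crossing edge weight over |C| * |V - C|."""
    n = len(A)
    inside = set(C)
    outside = [j for j in range(n) if j not in inside]
    crossing = sum(A[i][j] for i in inside for j in outside)
    return crossing / (len(inside) * len(outside))

def sparsest_cut_bruteforce(A):
    """Search all subsets C with 1 <= |C| <= n/2 for the minimum cut density."""
    n = len(A)
    best = None
    for size in range(1, n // 2 + 1):
        for C in combinations(range(n), size):
            d = cut_density(A, C)
            if best is None or d < best[1]:
                best = (set(C), d)
    return best

# Hypothetical graph: triangle {0, 1, 2}, triangle {3, 4, 5}, one attack edge 2-3.
A = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
C, density = sparsest_cut_bruteforce(A)
```

The search isolates one of the two triangles, since the single attack edge is the only edge crossing that cut. Because the cut is symmetric, the criterion alone does not say which side is the Sybil region; additional knowledge, such as a set of trusted seed nodes, is needed for that decision.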

Table 12  Classification of research works relative to Web communities according to the models and frameworks of the used communities

| Models | Frameworks | Research works |
|---|---|---|
| Graph model | Stochastic | Yan et al. (2017) |
| Machine learning model | Unsupervised | Bandari et al. (2019), Forsati et al. (2015), He et al. (2014), Sisodia et al. (2017), Tiwari et al. (2019), Tseng et al. (2014) |
| | Supervised | Bharti and Raval (2019), Katarya and Verma (2017) |
| Multi-objective model | Hierarchical | Di Marco and Navigli (2013) |
| | Metaheuristic | Abualigah et al. (2016), Cobos et al. (2014) |

Web communities, while no substantial work in this scope falls under the Game theory model.

9.5 Web service composition

Let $WSC = \{ws_1, \dots, ws_n\}$ be a composition of $n$ web services in the cloud. There are mainly two aspects in which the community notion appears in the context of web composition: Trust and Quality of Service (QoS).
Let $TC = \{tc_1, \dots, tc_m\}$ be a set of trust circles, where each of them is regulated by a certification authority. Every web service $ws_i \in WSC$, $i \in \{1, \dots, n\}$, belongs to a non-empty set of Trust circles $TC(ws_i) \subset TC$. The composition of web services $WSC$ should ensure the Trust policy between its web services, which can belong to different Trust circles.
On the other hand, given a set of web services $WS = \{ws_1, \dots, ws_N\}$, the problem of QoS-based web service composition consists in finding a synthesized composition profile $CP = \{cp_1, \dots, cp_m\}$ such that $cp_i \subset WS$, $\forall i \in \{1, \dots, m\}$, is a cluster of service composition which satisfies some predefined QoS requirements.
Consequently, the Service composition problem can be modeled as a problem of detecting communities of Trust and QoS-based clusters.
In this context, Nacer et al. (2017) proposed a dynamic, fully distributed model of authentication for composite Web services which is based on the trust circle concept. The proposed model has four main functionalities: it ensures reliable collaborative trust authentication for arbitrary composite Web services, it processes authentication beyond domain boundaries, it handles security policy conflicts, and finally, the communities model is scalable and dynamic.
More research works on web service composition communities are classified in Table 13 according to the frameworks and models which describe the taxonomy we proposed in Fig. 1.
From Table 13, we can see that only a few community detection frameworks have been used in the context of the Service composition application. These consist mainly of Meta-heuristic Multi-objective optimization and the Unsupervised learning framework. Furthermore, fewer works based on Hierarchical Multi-objective optimization and the Stochastic framework have been published. Moreover, the other frameworks have been scarcely investigated in this context.

10 Discussion, open issues and future research directions

After a thorough examination of existing models and techniques for community detection in complex networks used in the literature and a study of the different applications of communities, we discuss in this section the main open issues and prospective research directions that would be worth investigating.
proposed model has four main functionalities: it ensures

Table 13  Classification of research works relative to community detection in the context of service
composition according to the models and frameworks of the used communities

Models                               Framework               Research works
Graph model                          Stochastic              Chang et al. (2021)
Machine learning model               Unsupervised learning   Lei and Philip (2019), Jalal et al. (2023), Smahi et al. (2021)
Multi-objective optimization model   Hierarchical            Shang et al. (2013), Wen et al. (2019)
Multi-objective optimization model   Metaheuristic           Chhun et al. (2015), Khanouche et al. (2019), Klein et al. (2012), Pan and Chai (2018)
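
The trust-circle and QoS formulation of Sect. 9.5, on which the works in Table 13 build, can be illustrated with a minimal Python sketch. It is not the method of Nacer et al. (2017) or of any work cited in Table 13. Two modeling choices are assumptions made only for illustration: services are linked whenever they share a trust circle, and a cluster satisfies QoS when its summed response time stays under a bound. The service names, response times and the bound are hypothetical.

```python
from collections import defaultdict


def trust_communities(membership):
    """Group services into communities, linking two services when they share a trust circle.

    `membership` maps each service ws_i to its set of trust circles TC(ws_i).
    """
    # Index the services certified in each trust circle.
    by_circle = defaultdict(set)
    for service, circles in membership.items():
        if not circles:
            # Sect. 9.5 requires TC(ws_i) to be non-empty.
            raise ValueError(f"{service} belongs to no trust circle")
        for circle in circles:
            by_circle[circle].add(service)

    seen, communities = set(), []
    for start in membership:
        if start in seen:
            continue
        # Depth-first walk over services reachable through shared circles.
        stack, group = [start], set()
        while stack:
            service = stack.pop()
            if service in group:
                continue
            group.add(service)
            for circle in membership[service]:
                stack.extend(by_circle[circle] - group)
        seen |= group
        communities.append(group)
    return communities


def satisfies_qos(cluster, response_time, max_total_time):
    """Assumed QoS rule: total response time of the cluster stays within a bound."""
    return sum(response_time[s] for s in cluster) <= max_total_time


# Hypothetical services, trust-circle memberships and response times (ms).
membership = {"ws1": {"tc1"}, "ws2": {"tc1", "tc2"}, "ws3": {"tc2"}, "ws4": {"tc3"}}
response_time = {"ws1": 40, "ws2": 25, "ws3": 30, "ws4": 120}

communities = trust_communities(membership)
# Composition profile CP: trust communities that also meet the QoS bound.
profile = [c for c in communities if satisfies_qos(c, response_time, max_total_time=100)]
```

With these toy values, ws1, ws2 and ws3 form one trust community through the shared circles tc1 and tc2, while ws4 is isolated and is then filtered out by the QoS bound. A real composition engine would replace both rules with its own trust policy and QoS aggregation functions.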

• Community detection is widely used in collaborative filtering and social collaborative filtering.
  Indeed, as shown in Fig. 4, more than 30% of the relevant research works on applications of network
  communities, published mainly in the last decade, are related to collaborative filtering. However, the
  communities used are generally limited to the ego-centered network around a given node (adjacent
  neighbors, or K-nearest neighbors). As a consequence, it would be worthwhile to investigate further the
  applicability of more accurate community detection algorithms to formally find more optimized
  coarse-grained clusters. This would enhance the time performance of the system without altering the
  accuracy dimension, as the information of correlated non-adjacent nodes of the same communities is
  considered.
• Even though many improvements have been made in the context of Collaborative Filtering based on
  community detection, further studies should be undertaken, more specifically on scalability and massive
  data processing by systems that generally have limited storage and computational resources. Thus, to
  enhance processing capacity and reduce the computing time, it would be interesting to develop parallel
  algorithms for distributed computing systems.
• There has been a growing number of community detection methods for detecting Sybil attacks. Indeed, as
  shown in Fig. 4, more than a quarter (> 25%) of the main research works published in the context of
  community network applications concern Sybil attack detection. However, these methods are mainly based
  on graph theory methods, which generally perform poorly when dealing with real-world attacks. Indeed,
  most graph-based detection techniques assume, in order to detect Sybil clusters, that the whole topology
  of the network is known and that the detection of Sybil attacks is a centralized task. Thus, it would be
  interesting to use online Sybil detection methods that are based on decentralized approaches with a
  local, incomplete view of the topological information.
• Graph community models are widely used in Sybil and Shilling attack detection, and it has been shown
  that they outperform many other content-based approaches. Despite these promising results in security
  fields, questions remain about the applicability of these approaches in other security attack detection
  mechanisms such as Intrusion Detection Systems (IDS).
• Multi-objective optimization has been widely used in the literature for community detection in complex
  networks from theoretical perspectives. However, far fewer research works use Multi-objective
  optimization for detecting community structures in real network applications. Thus, using
  Multi-objective optimization based community detection algorithms to enhance the performance of network
  applications would be of great interest to the scientific community.
• Another important open issue is the detection of dynamic communities in graph data streaming, which is
  very common in real online social networks. In this case, large graph data is received as a continuous
  edge stream. This issue is more challenging, because an edge in the graph cannot be processed more than
  once during the computation process. In such cases, summary structures should be designed in order to
  facilitate an effective community detection process.

11 Conclusion

Community detection serves as a backbone for information mining. The present article aims to extensively
survey relevant and recent research works about community detection from a new angle: classifying them
according to the mathematical model used: Graph model, Machine learning, Multi-objective, and Game theory.
In this setting, this work highlights the importance of community detection in real network applications
and data mining in information network fields such as Recommender systems, Sybil attack defense, Friend
recommendations, Web clustering and Service composition. The importance and originality of this study is
that it explores both the existing community detection frameworks and their applicability in the analysis
of real graph databases in order to explore and

elaborate new strategies to optimally solve many network application issues in real big data ecosystems.
Indeed, even though it has been proven that community detection can be an effective way to enhance the
accuracy and performance of many network applications, questions remain about their scalability and their
effectiveness in detecting communities in continuously growing dynamic complex information networks.
Finally, this survey can be considered as a starting point for researchers interested in dealing with the
community detection issue from both theoretical and application points of view, in order to design more
accurate approaches for community detection that extract useful structural information to enhance the
performance of real network applications.

Acknowledgements The authors are grateful to the anonymous referees for their valuable suggestions and
comments which have helped to improve the quality of the paper and its presentation.

Author contributions All authors have seen and approved the manuscript, and contributed significantly to
the work.

Declarations

Competing interests The authors declare no competing interests.

References

Abualigah LM, Khader AT, Al-Betar MA, Awadallah MA (2016) A krill herd algorithm for efficient text documents clustering. In: Computer applications & industrial electronics (ISCAIE), 2016 IEEE Symposium On, pp. 67–72. IEEE
Adamic LA, Adar E (2003) Friends and neighbors on the web. Soc Netw 25(3):211–230
Ahmed F, Abulaish M (2013) Identification of sybil communities generating context-aware spam on online social networks. In: Asia-Pacific Web Conference, Springer. pp. 268–279
Akbari F, Tajfar AH, Nejad AF (2013) Graph-based friend recommendation in social networks using artificial bee colony. In: Dependable, autonomic and secure computing (DASC), 2013 IEEE 11th International Conference On, pp. 464–468. IEEE
Al-Andoli MN, Tan SC, Cheah WP (2022) Distributed parallel deep learning with a hybrid backpropagation-particle swarm optimization for community detection in large complex networks. Inf Sci 600:94–117
Ali M, Hassan M, Kifayat K, Kim JY, Hakak S, Khan MK (2023) Social media content classification and community detection using deep learning and graph analytics. Technol Forec Soc Chang 188:122252
Al-Oufi S, Kim H-N, El Saddik A (2012) A group trust metric for identifying people of trust in online social networks. Expert Syst Appl 39(18):13173–13181
Alpert CJ, Kahng AB, Yao S-Z (1999) Spectral partitioning with multiple eigenvectors. Discret Appl Math 90(1–3):3–26
Alqadah F, Reddy CK, Hu J, Alqadah HF (2015) Biclustering neighborhood-based collaborative filtering method for top-n recommender systems. Knowl Inf Syst 44(2):475–491
Al-sharoa E, Rahahleh B (2023) Community detection in networks through a deep robust auto-encoder nonnegative matrix factorization. Eng Appl Artif Intell 118:105657
Alvari H, Hashemi S, Hamzeh A (2011) Detecting overlapping communities in social networks by game theory and structural equivalence concept. In: International conference on artificial intelligence and computational intelligence, pp. 620–630. Springer
Amelio A, Pizzuti C (2015) Is normalized mutual information a fair measure for comparing community detection methods? In: Proceedings of the 2015 IEEE/ACM international conference on advances in social networks analysis and mining 2015, pp. 1584–1585
AMI FL-M (1972) On the decomposition of networks into minimally interconnected subnetworks. IEEE transactions on Circuit Theory, CT-16 2
Amiri B, Hossain L, Crawford JW, Wigand RT (2013) Community detection in complex networks: Multi-objective enhanced firefly algorithm. Knowl-Based Syst 46:1–11
Andersen R, Lang KJ (2006) Communities from seed sets. In: Proceedings of the 15th international conference on world wide web, pp. 223–232
Ayachi M, Nacer H, Slimani H (2021) Cooperative game approach to form overlapping cloud federation based on inter-cloud architecture. Clust Comput 24(2):1551–1577
Ayachi M, Nacer H, Slimani H (2021) Correction to: cooperative game approach to form overlapping cloud federation based on inter-cloud architecture. Clust Comput 24(2):1579–1582
Azadjalal MM, Moradi P, Abdollahpouri A, Jalili M (2017) A trust-aware recommendation method based on Pareto dominance and confidence concepts. Knowl-Based Syst 116:130–143
Bacci G, Lasaulce S, Saad W, Sanguinetti L (2015) Game theory for networks: a tutorial on game-theoretic tools for emerging signal processing applications. IEEE Signal Process Mag 33(1):94–119
Badami M, Hamzeh A, Hashemi S (2013) An enriched game-theoretic framework for multi-objective clustering. Appl Soft Comput 13(4):1853–1868
Bagci H, Karagoz P (2016) Context-aware friend recommendation for location based social networks using random walk. In: Proceedings of the 25th International Conference Companion on World Wide Web, pp. 531–536. International World Wide Web Conferences Steering Committee
Bandari D, Xiang S, Martin J, Leskovec J (2019) Categorizing user sessions at pinterest. In: 2019 IEEE International Conference on Big Data and Smart Computing (BigComp), pp. 1–8. IEEE
Belli D, Chessa S, Foschini L, Girolami M (2020) The rhythm of the crowd: Properties of evolutionary community detection algorithms for mobile edge selection. Pervasive Mob Comput 67:101231
Bellogin A, Parapar J (2012) Using graph partitioning techniques for neighbour selection in user-based collaborative filtering. In: Proceedings of the Sixth ACM conference on recommender systems, pp. 213–216. ACM
Bello-Orgaz G, Salcedo-Sanz S, Camacho D (2018) A multi-objective genetic algorithm for overlapping community detection based on edge encoding. Inf Sci 462:290–314
Bharti PM, Raval TJ (2019) Improving web page access prediction using web usage mining and web content mining. In: 2019 3rd international conference on electronics, communication and aerospace technology (ICECA), pp. 1268–1273. IEEE
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):10008
Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang D-U (2006) Complex networks: structure and dynamics. Phys Rep 424(4–5):175–308

Boshmaf Y, Logothetis D, Siganos G, Lería J, Lorenzo J, Ripeanu M, Beznosov K, Halawa H (2016) Íntegro: leveraging victim prediction for robust fake account detection in large scale OSNs. Comput Secur 61:142–168
Boshmaf Y, Beznosov K, Ripeanu M (2013) Graph-based Sybil Detection in social and information systems. In: 2013 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM 2013)
Bouyer A, Roghani H (2020) Lsmd: a fast and robust local community detection starting from low degree nodes in social networks. Futur Gener Comput Syst 113:41–57
Cai Y, Leung H-F, Li Q, Min H, Tang J, Li J (2014) Typicality-based collaborative filtering recommendation. IEEE Trans Knowl Data Eng 26(3):766–779
Cai B, Wang Y, Zeng L, Hu Y, Li H (2020) Edge classification based on convolutional neural networks for community detection in complex network. Phys A 556:124826
Cai B, Wang M, Chen Y, Hu Y, Liu M (2022) Mff-net: a multi-feature fusion network for community detection in complex network. Knowl-Based Syst 252:109408
Cai Z, Jermaine C (2012) The latent community model for detecting sybil attacks in social networks. In: Proc NDSS
Cañamares R, Castells P (2017) A probabilistic reformulation of memory-based collaborative filtering: implications on popularity biases. In: Proceedings of the 40th international ACM SIGIR conference on research and development in information retrieval, pp. 215–224. ACM
Cao J, Jin D, Yang L, Dang J (2018) Incorporating network structure with node contents for community detection on large networks using deep learning. Neurocomputing 297:71–81
Cao Q, Sirivianos M, Yang X, Pregueiro T (2012) Aiding the detection of fake accounts in large scale social online services. In: Proceedings of the 9th USENIX conference on networked systems design and implementation, p. 15. USENIX Association
Casino F, Domingo-Ferrer J, Patsakis C, Puig D, Solanas A (2015) A k-anonymous approach to privacy preserving collaborative filtering. J Comput Syst Sci 81(6):1000–1011
Chakrabarty N, Chowdhury S, Kanni SD, Mukherjee S (2019) FAFinder: friend suggestion system for social networking. In: International conference on intelligent data communication technologies and Internet of Things, pp. 51–58. Springer
Chakraborty T, Dalmia A, Mukherjee A, Ganguly N (2017) Metrics for community analysis: a survey. ACM Comput Surv 50(4):1–37
Chang J-L, Li H, Bi J-W (2022) Personalized travel recommendation: a hybrid method with collaborative filtering and social network analysis. Curr Issue Tour 25(14):2338–2356
Chang Z, Ding D, Xia Y (2021) A graph-based QoS prediction approach for web service recommendation. Appl Intell. pp 1–15
Chang W, Wu J, Tan CC, Li F (2013) Sybil defenses in mobile social networks. In: 2013 IEEE Global Communications Conference (GLOBECOM)
Chen K, Bi W (2019) A new genetic algorithm for community detection using matrix representation method. Phys A 535:122259
Chen Y, Mo D (2022) Community detection for multilayer weighted networks. Inf Sci 595:119–141
Chen W, Liu Z, Sun X, Wang Y (2010) A game-theoretic framework to identify overlapping communities in social networks. Data Min Knowl Disc 21(2):224–240
Chen M-H, Teng C-H, Chang P-C (2015) Applying artificial immune systems to collaborative filtering for movie recommendation. Adv Eng Inform 29(4):830–839
Chen J, Wang B, Ouyang Z, Wang Z (2021) Dynamic clustering collaborative filtering recommendation algorithm based on double-layer network. Int J Mach Learn Cybern 12:1097–1113
Chen C, Zhu W, Peng B (2022) Differentiated graph regularized nonnegative matrix factorization for semi-supervised community detection. Phys A 604:127692
Cheng F, Cui T, Su Y, Niu Y, Zhang X (2018) A local information based multi-objective evolutionary algorithm for community detection in complex networks. Appl Soft Comput 69:357–367
Chen M, Wei Z, Huang Z, Ding B, Li Y (2020) Simple and deep graph convolutional networks. In: International conference on machine learning, pp. 1725–1735. PMLR
Cherifi C, Rivierre Y, Santucci J-F (2013) A community based algorithm for large scale web service composition. arXiv preprint arXiv:1305.0187
Chhun S, Malang K, Cherifi C, Moalla N, Ouzrout Y (2015) A web service composition framework based on centrality and community structure. In: 2015 11th International Conference on Signal-Image Technology Internet-Based Systems (SITIS), pp. 489–496
Cobos C, Muñoz-Collazos H, Urbano-Muñoz R, Mendoza M, León E, Herrera-Viedma E (2014) Clustering of web search results based on the cuckoo search algorithm and balanced Bayesian information criterion. Inf Sci 281:248–264
Contisciani M, Battiston F, De Bacco C (2022) Inference of hyperedges and overlapping communities in hypergraphs. Nat Commun 13(1):7229
Costa AR, Ralha CG (2023) Ac2cd: an actor-critic architecture for community detection in dynamic social networks. Knowl-Based Syst 261:110202
Cui L, Wu J, Pi D, Zhang P, Kennedy P (2018) Dual Implicit Mining-Based Latent Friend Recommendation. IEEE Trans Syst Man Cybern Syst 50:1663
Danezis G, Mittal P (2009) SybilInfer: detecting Sybil nodes using social networks. In: NDSS, pp. 1–15. San Diego, CA
Das S, Biswas A, Dasgupta S, Abraham A (2009) Bacterial foraging optimization algorithm: theoretical foundations, analysis, and applications. Foundations of computational intelligence volume 3: global optimization, 23–55
De Nooy W, Mrvar A, Batagelj V (2018) Exploratory social network analysis with Pajek: revised and expanded edition for updated software, vol 46. Cambridge University Press, Cambridge
De Santo A, Galli A, Moscato V, Sperlì G (2021) A deep learning approach for semi-supervised community detection in online social networks. Knowl-Based Syst 229:107345
Deng S, Huang L, Xu G, Wu X, Wu Z (2016) On deep learning for trust-aware recommendations in social networks. IEEE Trans Neural Netw Learn Syst 28(5):1164–1177
Deng Z-H, Qiao H-H, Song Q, Gao L (2019) A complex network community detection algorithm based on label propagation and fuzzy c-means. Phys A 519:217–226
Di Marco A, Navigli R (2013) Clustering and diversifying web search results with graph-based word sense induction. Comput Linguist 39(3):709–754
Ding J, He X, Yuan J, Chen Y, Jiang B (2018) Community detection by propagating the label of center. Phys A 503:675–686
Duan Z, Zou H, Min X, Zhao S, Chen J, Zhang Y (2019) An adaptive granulation algorithm for community detection based on improved label propagation. Int J Approx Reason 114:115–126
Ebrahimi M, Shahmoradi MR, Heshmati Z, Salehi M (2018) A novel method for overlapping community detection using multi-objective optimization. Phys A 505:825–835
Eremeev AV (2018) On proportions of fit individuals in population of mutation-based evolutionary algorithm with tournament selection. Evol Comput 26(2):269–297
Fang C, Lin Z-Z (2022) Overlapping communities detection based on cluster-ability optimization. Neurocomputing 494:336–345
Fang W, Wang X, Liu L, Wu Z, Tang S, Zheng Z (2022) Community detection through vector-label propagation algorithms. Chaos Solitons Fractals 158:112066

Feng L, Zhao Q, Zhou C (2021) Incorporating affiliation preference into overlapping community detection. Phys A 563:125429
Fletcher KK, Liu XF (2015) A collaborative filtering method for personalized preference-based service recommendation. In: Web Services (ICWS), 2015 IEEE International Conference On, pp. 400–407. IEEE
Forsati R, Moayedikia A, Shamsfard M (2015) An effective Web page recommender using binary data clustering. Inf Retr J 18(3):167–214
Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174
Fortunato S, Hric D (2016) Community detection in networks: a user guide. Phys Rep 659:1–44
Francisquini R, Lorena AC, Nascimento MC (2022) Community-based anomaly detection using spectral graph filtering. Appl Soft Comput 118:108489
Gao P, Wang B, Gong NZ, Kulkarni SR, Thomas K, Mittal P (2018) Sybilfuse: combining local attributes with global structure to perform robust sybil detection. In: 2018 IEEE conference on communications and network security (CNS), pp. 1–9. IEEE
Gholami M, Sheikhahmadi A, Khamforoosh K, Jalili M (2022) Overlapping community detection in networks based on neutrosophic theory. Phys A 598:127359
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. Advances in neural information processing systems 27
Guerrero M, Montoya FG, Baños R, Alcayde A, Gil C (2017) Adaptive community detection in complex networks using genetic algorithms. Neurocomputing 266:101–113
Gui C, Zhang R, Hu R, Huang G, Wei J (2018) Overlapping communities detection based on spectral analysis of line graphs. Phys A 498:50–65
Gupta K, Srivastava AV, Raj G (2018) K-mean clustering in web service quality datasets using AWS and RapidMiner. In: 2018 international conference on advances in computing and communication engineering (ICACCE), pp. 201–206. IEEE
Hagen L, Kahng AB (1992) New spectral methods for ratio cut partitioning and clustering. IEEE Trans Comput Aided Des Integr Circuits Syst 11(9):1074–1085
Hajibagheri A, Alvari H, Hamzeh A, Hashemi S (2012) Social networks community detection using the shapley value. In: The 16th CSI international symposium on artificial intelligence and signal processing (AISP 2012), pp. 222–227. IEEE
Hämäläinen W (2006) Class np, np-complete, and np-hard problems. Sort, 1–7
Haq NF, Moradi M, Wang ZJ (2019) Community structure detection from networks with weighted modularity. Pattern Recogn Lett 122:14–22
He C, Zhang Q, Tang Y, Liu S, Zheng J (2019) Community detection method based on robust semi-supervised nonnegative matrix factorization. Phys A 523:279–291
He C, Tang Y, Liu H, Fei X, Li H, Liu S (2019) A robust multi-view clustering method for community detection combining link and content information. Phys A 514:396–411
He C, Zheng Y, Cheng J, Tang Y, Chen G, Liu H (2022) Semi-supervised overlapping community detection in attributed graph with graph convolutional autoencoder. Inf Sci 608:1464–1479
He X, Kan M-Y, Xie P, Chen X (2014) Comment-based multi-view clustering of web 2.0 items. In: Proceedings of the 23rd international conference on world wide web, pp. 771–782. ACM
Hernando A, Bobadilla J, Ortega F (2016) A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model. Knowl-Based Syst 97:188–202
Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507
Hosseini-Pozveh M, Ghorbanian M, Tabaiyan M (2022) A label propagation-based method for community detection in directed signed social networks. Phys A 604:127875
Hu R, Dou W, Liu J (2014) ClubCF: a clustering-based collaborative filtering approach for big data application. IEEE Trans Emerg Top Comput 2(3):302–313
Huang J, Zhang T, Yu W, Zhu J, Cai E (2021) Community detection based on modularized deep nonnegative matrix factorization. Int J Pattern Recognit Artif Intell 35(02):2159006
Huang J, Xie Y, Yu F, Ke Q, Abadi M, Gillum E, Mao ZM (2013) SocialWatch: detection of online service abuse via large-scale social graphs. In: AsiaCCS
Jalal S, Yadav DK, Negi CS (2023) Web service discovery with incorporation of web services clustering. Int J Comput Appl 45(1):51–62
Jia C, Li Y, Carson MB, Wang X, Yu J (2017) Node attribute-enhanced community detection in complex networks. Sci Rep 7(1):1–15
Jiang JQ, McQuay LJ (2012) Modularity functions maximization with nonnegative relaxation facilitates community detection in networks. Phys A 391(3):854–865
Jiang L, Shi L, Liu L, Yao J, Ali ME (2022) User interest community detection on social media using collaborative filtering. Wireless Netw. pp 1–7
Jia J, Wang B, Gong NZ (2017) Random walk based fake account detection in online social networks. In: Dependable Systems and Networks (DSN), 2017 47th Annual IEEE/IFIP International Conference On, pp. 273–284. IEEE
Jia Z, Yang Y, Gao W, Chen X (2015) User-based collaborative filtering for tourist attraction recommendations. In: Computational intelligence & communication technology (CICT), 2015 IEEE International Conference On, pp. 22–25. IEEE
Jin D, Gabrys B, Dang J (2015) Combined node and link partitions method for finding overlapping communities in complex networks. Sci Rep 5(1):8600
Jin H, Yu W, Li S (2019) Graph regularized nonnegative matrix tri-factorization for overlapping community detection. Phys A 515:376–387
Jin D, Yu Z, Jiao P, Pan S, He D, Wu J, Philip SY, Zhang W (2021) A survey of community detection approaches: from statistical modeling to deep learning. IEEE Trans Knowl Data Eng 35(2):1149–1170
Jonnalagadda A, Kuppusamy L (2016) A survey on game theoretic models for community detection in social networks. Soc Netw Anal Min 6(1):83
Kanavos A, Kotoula P, Makris C, Iliadis L (2019) Employing query disambiguation using clustering techniques. Evolving Systems, 1–11
Kant S, Mahara T (2018) Merging user and item based collaborative filtering to alleviate data sparsity. Int J Syst Assur Eng Manag 9(1):173–179
Katarya R, Verma OP (2017) An effective web page recommender system with fuzzy c-mean clustering. Multimed Tools Appl 76(20):21481–21496
Khanouche ME, Attal F, Amirat Y, Chibani A, Kerkar M (2019) Clustering-based and QoS-aware services composition algorithm for ambient intelligence. Inf Sci 482:419–439
Kim J, Lee J-G (2015) Community detection in multi-layer graphs: a survey. ACM SIGMOD Rec 44(3):37–48
Klein A, Ishikawa F, Honiden S (2012) Towards network-aware service composition in the cloud. In: Proceedings of the 21st international conference on world wide web, pp. 959–968. ACM

Koc I (2022) A fast community detection algorithm based on coot bird metaheuristic optimizer in social networks. Eng Appl Artif Intell 114:105202
Koohi H, Kiani K (2016) User based collaborative filtering using fuzzy C-means. Measurement 91:134–139
Koohi H, Kiani K (2017) A new method to find neighbor users that improves the performance of collaborative filtering. Expert Syst Appl 83:30–39
Laassem B, Idarrou A, Boujlaleb L et al (2022) Label propagation algorithm for community detection based on coulomb’s law. Phys A 593:126881
Lalwani D, Somayajulu DVLN, Krishna PR (2015) A community driven social recommendation system. In: 2015 IEEE international conference on big data (big Data), pp. 821–826. IEEE
Lee W-P, Ma C-Y (2016) Enhancing collaborative recommendation performance by combining user preference and trust-distrust propagation in social networks. Knowl-Based Syst 106:125–134
Lei Y, Philip SY (2019) Cloud service community detection for real world service networks based on parallel graph computing. IEEE Access 7:131355
Lei Y, Zhou Y, Shi J (2019) Overlapping communities detection of social network based on hybrid c-means clustering algorithm. Sustain Cities Soc 47:101436
Li X (2019) Growth curve based label propagation algorithm for community detection. Phys Lett A 383(21):2481–2487
Li M, Liu J (2018) A link clustering based memetic algorithm for overlapping community detection. Phys A 503:410–423
Li Y-M, Wu C-T, Lai C-Y (2013) A social recommender mechanism for e-commerce: combining similarity, trust, and relationship. Decis Support Syst 55(3):740–752
Li J, Wang X, Cui Y (2014) Uncovering the overlapping community structure of complex networks by maximal cliques. Phys A 415:398–406
Li Y, Jia C, Yu J (2015) A parameter-free community detection method based on centrality and dispersion of nodes in complex networks. Phys A 438:321–334
Li X, Cheng X, Su S, Li S, Yang J (2017) A hybrid collaborative filtering model for social influence prediction in event-based social networks. Neurocomputing 230:197–209
Li F, Zhang L, Liu Y, Laili Y, Tao F (2017) A clustering network-based approach to service composition in cloud manufacturing. Int J Comput Integr Manuf 30(12):1331–1342
Li Y, Jia C, Li J, Wang X, Yu J (2018) Enhanced semi-supervised community detection with active node and link selection. Phys A 510:219–232
Li X, Xu G, Tang M (2018) Community detection for multi-layer social network based on local random walk. J Vis Commun Image Represent 57:91–98
Li X, Wu X, Xu S, Qing S, Chang P-C (2019) A novel complex network community detection approach using discrete particle swarm optimization with particle diversity and mutation. Appl Soft Comput 81:105476
Li C, Bai J, Wenjun Z, Xihao Y (2019) Community detection using hierarchical clustering based on edge-weighted similarity in cloud environment. Inf Process Manag 56(1):91–109
Li S, Jiang L, Wu X, Han W, Zhao D, Wang Z (2021) A weighted network community detection algorithm based on deep learning. Appl Math Comput 401:126012
Li B, Wang M, Hopcroft JE, He K (2022) Hosim: higher-order structural importance based method for multiple local community detection. Knowl-Based Syst 256:109853
Li T, He T (2014) Privacy-aware web services selection and composition. In: Service Sciences (ICSS), 2014 International Conference On, pp. 147–151. IEEE
Liu Z, Ma Y (2019) A divide and agglomerate algorithm for community detection in social networks. Inf Sci 482:321–333
Liu Z, Luo X, Wang Z, Liu X (2023) Constraint-induced symmetric nonnegative matrix factorization for accurate community detection. Inf Fusion 89:588–602
Liu P, Wang X, Che X, Chen Z, Gu Y (2014) Defense against sybil attacks in directed social networks. In: 2014 19th international conference on digital signal processing
Lu H, Sang X, Zhao Q, Lu J (2020) Community detection algorithm based on nonnegative matrix factorization and pairwise constraints. Phys A 545:123491
Lu H, Song Y, Wei H (2020) Multiple-kernel combination fuzzy clustering for community detection. Soft Comput 24:1–9
Luo M, Xu Y (2022) Community detection via network node vector label propagation. Phys A 593:126931
Ma H, Liu Z, Zhang X, Zhang L, Jiang H (2021) Balancing topology structure and node attribute in evolutionary multi-objective community detection for attributed networks. Knowl-Based Syst 227:107169
Ma W, Hu S-Z, Dai Q, Wang T-T, Huang Y-F (2014) Sybil-Resist: a new protocol for sybil attack defense in social network. In: International conference on applications and techniques in information security, pp. 219–230. Springer
Malhotra D (2021) Community detection in complex networks using link strength-based hybrid genetic algorithm. SN Comput Sci 2(1):1–16
Malhotra D, Chug A (2021) A modified label propagation algorithm for community detection in attributed networks. Int J Inf Manag Data Insights 1(2):100030
Malliaros FD, Vazirgiannis M (2013) Clustering and community detection in directed networks: a survey. Phys Rep 533(4):95–142
Mishra S, Singh SS, Mishra S, Biswas B (2021) TCD2: tree-based community detection in dynamic social networks. Expert Syst Appl 169:114493
Misra S, Tayeen ASM, Xu W (2016) SybilExposer: an effective scheme to detect Sybil communities in online social networks. In: 2016 IEEE international conference on communications (ICC), pp. 1–6
Mitchell TM, et al (1997) Machine learning
Mohaisen A, Hopper N, Kim Y (2011) Keep your friends close: incorporating trust into social network-based Sybil defenses. INFOCOM 11:336–340
Mokken RJ (1979) Cliques, clubs and clans. Qual Quant 13(2):161–173
Monderer D, Shapley LS (1996) Potential games. Games Econom Behav 14(1):124–143
Moradi M, Parsa S (2019) An evolutionary method for community detection using a novel local search strategy. Phys A 523:457–475
Moscato V, Picariello A, Sperli G (2019) Community detection based on game theory. Eng Appl Artif Intell 85:773–782
Mulamba D, Ray I, Ray I (2016) SybilRadar: a graph-structure based framework for Sybil detection in on-line social networks. In: IFIP international information security and privacy conference, pp. 179–193. Springer
Nacer H, Djebari N, Slimani H, Aissani D (2017) A distributed authentication model for composite web services. Comput Secur 70:144–178
Nan D-Y, Yu W, Liu X, Zhang Y-P, Dai W-D (2018) A framework of community detection based on individual labels in attribute networks. Phys A 512:523–536
Narayanam R, Narahari Y (2012) A game theory inspired, decentralized, local information based algorithm for community detection in social graphs. In: Proceedings of the 21st international conference on pattern recognition (ICPR2012), pp. 1072–1075. IEEE
Nascimento MCV, Carvalho ACPLF (2011) Spectral methods for graph clustering - a survey. Eur J Oper Res 211(2):221–231
Nash JF Jr (1950) The bargaining problem. Econometrica J Econ Soc 18:155–162
Nath K, Shanmugam R, Varadaranjan V (2021) ma-code: a multi-phase approach on community detection in evolving networks. Inf Sci 569:326–343
Nema R, Pandey A (2015) Community kernels detection in OSN using SVM clustering and classification. Int J Comput Appl. 113(2015)
Newman ME (2006) Modularity and community structure in networks. Proc Natl Acad Sci 103(23):8577–8582
Newman ME (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104
Newman M (2010) Networks: an introduction. Oxford University Press, Oxford
Nilizadeh S, Labrèche F, Sedighian A, Zand A, Fernandez J, Kruegel C, Stringhini G, Vigna G (2017) Poised: Spotting twitter spam off the beaten paths. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, pp. 1159–1174. ACM
Niu Y, Kong D, Liu L, Wen R, Xiao J (2023) Overlapping community detection with adaptive density peaks clustering and iterative partition strategy. Expert Syst Appl 213:119213
Okamoto H, Qiu X (2022) Detecting hierarchical organization of pervasive communities by modular decomposition of Markov chain. Sci Rep 12(1):20211
Okoli C, Schabram K (2015) A guide to conducting a systematic literature review of information systems research
Paleti L, Radha Krishna P, Murthy J (2021) Approaching the cold-start problem using community detection based alternating least square factorization in recommendation systems. Evol Intel 14:835–849
Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814
Pan W, Chai C (2018) Structure-aware mashup service clustering for cloud-based Internet of Things using genetic algorithm based clustering algorithm. Future Gener Comput Syst 87:267
Patra BK, Launonen R, Ollikainen V, Nandi S (2015) A new similarity measure using Bhattacharyya coefficient for collaborative filtering in sparse data. Knowl-Based Syst 82:163–177
Pattanayak HS, Sangal AL, Verma HK (2019) Community detection in social networks based on fire propagation. Swarm Evol Comput 44:31–48
Pérez-Peló S, Sanchez-Oro J, Gonzalez-Pardo A, Duarte A (2021) A fast variable neighborhood search approach for multi-objective community detection. Appl Soft Comput 112:107838
Pham MC, Cao Y, Klamma R, Jarke M (2011) A clustering approach for collaborative filtering recommendation using social network analysis. J UCS 17(4):583–604
Pirasteh P, Hwang D, Jung JE (2015) Weighted similarity schemes for high scalability in user-based collaborative filtering. Mobile Netw Appl 20(4):497–507
Pizzuti C (2018) Evolutionary computation for community detection in networks: a review. IEEE Trans Evol Comput 22(3):464–483
Plantié M, Crampes M (2013). In: Ramzan N, Zwol R, Lee J-S, Clüver K, Hua X-S (eds) Survey on social community detection. Springer, London
Polatidis N, Georgiadis CK (2017) A dynamic multi-level collaborative filtering method for improved recommendations. Comput Stand Interfaces 51:14–21
Pourabbasi E, Majidnezhad V, Afshord ST, Jafari Y (2021) A new single-chromosome evolutionary algorithm for community detection in complex networks by combining content and structural information. Expert Syst Appl 186:115854
Qie H, Li S, Dou Y, Xu J, Xiong Y, Gao Z (2022) Isolate sets partition benefits community detection of parallel Louvain method. Sci Rep 12(1):8248
Qin M, Lei K (2021) Dual-channel hybrid community detection in attributed networks. Inf Sci 551:146–167
Que X, Checconi F, Petrini F, Gunnels JA (2015) Scalable community detection with the Louvain algorithm. In: 2015 IEEE international parallel and distributed processing symposium, IEEE. pp. 28–37
Rahimi S, Abdollahpouri A, Moradi P (2018) A multi-objective particle swarm optimization algorithm for community detection in complex networks. Swarm Evol Comput 39:297–309
Ramalingam D, Chinnaiah V, Jeyagobi A (2018) Privacy preserving schemes for secure interactions in online social networks. In: International conference on soft computing systems, pp. 548–557. Springer
Ramesh A, Srivatsun G (2021) Evolutionary algorithm for overlapping community detection using a merged maximal cliques representation scheme. Appl Soft Comput 112:107746
Reihanian A, Feizi-Derakhshi M-R, Aghdasi HS (2023) An enhanced multi-objective biogeography-based optimization for overlapping community detection in social networks with node attributes. Inf Sci 622:903–929
Roghani H, Bouyer A, Nourani E (2021) Pldls: a novel parallel label diffusion and label selection-based community detection algorithm based on spark in social networks. Expert Syst Appl 183:115377
Roozbahani Z, Rezaeenour J, Katanforoush A (2023) Community detection in multi-relational directional networks. J Comput Sci 67:101962
Rossetti G, Cazabet R (2018) Community discovery in dynamic networks: a survey. ACM Comput Surv 51(2):1–37
Rostami M, Oussalah M (2022) A novel attributed community detection by integration of feature weighting and node centrality. Online Soc Netw Media 30:100219
Salha-Galvan G, Lutzeyer JF, Dasoulas G, Hennequin R, Vazirgiannis M (2022) Modularity-aware graph autoencoders for joint community detection and link prediction. Neural Netw 153:474–495
Samanthula BK, Jiang W (2015) Interest-driven private friend recommendation. Knowl Inf Syst 42(3):663–687
Saranya KG, Sadasivam GS (2017) Modified heuristic similarity measure for personalization using collaborative filtering technique. Appl Math 11(1):307–315
Sattari M, Zamanifar K (2018) A spreading activation-based label propagation algorithm for overlapping community detection in dynamic social networks. Data Knowl Eng 113:155–170
Sattari M, Zamanifar K (2018) A cascade information diffusion based label propagation algorithm for community detection in dynamic social networks. J Comput Sci 25:122–133
Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2008) The graph neural network model. IEEE Trans Neural Networks 20(1):61–80
Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347
Shahmoradi MR, Ebrahimi M, Heshmati Z, Salehi M (2019) Multilayer overlapping community detection using multi-objective optimization. Futur Gener Comput Syst 101:221–235
Shamshirband S, Patel A, Anuar NB, Kiah MLM, Abraham A (2014) Cooperative game theoretic approach using fuzzy q-learning for detecting and preventing intrusions in wireless sensor networks. Eng Appl Artif Intell 32:228–241
Shang R, Liu H, Jiao L, Esfahani AMG (2017) Community mining using three closely joint techniques based on community mutual membership and refinement strategy. Appl Soft Comput 61:1060–1073
Shang R, Zhao K, Zhang W, Feng J, Li Y, Jiao L (2022) Evolutionary multiobjective overlapping community detection based on similarity matrix and node correction. Appl Soft Comput 127:109397
Shang R, Zhang W, Li Z, Wang C, Jiao L (2023) Attribute community detection based on latent representation learning and graph regularized non-negative matrix factorization. Appl Soft Comput 133:109932
Shang J, Liu L, Wu C (2013) WSCN: Web service composition based on complex networks. In: Service Sciences (ICSS), 2013 international conference on, pp. 208–213. IEEE
Shen X, Yao X, Tu H, Gong D (2022) Parallel multi-objective evolutionary optimization based dynamic community detection in software ecosystem. Knowl-Based Syst 252:109404
Shi P, He K, Bindel D, Hopcroft JE (2019) Locally-biased spectral approximation for community detection. Knowl-Based Syst 164:459–472
Shi L, Yu S, Lou W, Hou YT (2013) Sybilshield: An agent-aided social network-based sybil defense among multiple communities. In: INFOCOM, 2013 Proceedings IEEE, pp. 1034–1042. IEEE
Sisodia DS, Verma S, Vyas OP (2017) Augmented intuitive dissimilarity metric for clustering of web user sessions. J Inf Sci 43(4):480–491
Smahi MI, Hadjila F, Tibermacine C, Benamar A (2021) A deep learning approach for collaborative prediction of web service QoS. SOCA 15:5–20
Stringhini G, Mourlanne P, Jacob G, Egele M, Kruegel C, Vigna G (2015) EVILCOHORT: detecting communities of malicious accounts on online services. In: 24th USENIX Security Symposium (USENIX Security 15), pp. 563–578. USENIX Association, Washington, D.C
Su Y, Zhou K, Zhang X, Cheng R, Zheng C (2021) A parallel multi-objective evolutionary algorithm for community detection in large-scale complex networks. Inf Sci 576:374–392
Sui S-K, Li J-P, Zhang J-G, Sui S-J (2016) The community detection based on SVM algorithm. In: 2016 13th international computer conference on wavelet active media technology and information processing (ICCWAMTIP), pp. 131–134. IEEE
Sun B-J, Shen H, Gao J, Ouyang W, Cheng X (2017) A non-negative symmetric encoder-decoder approach for community detection. In: Proceedings of the 2017 ACM on conference on information and knowledge management, pp. 597–606
Sun H, Jie W, Loo J, Wang L, Ma S, Han G, Wang Z, Xing W (2018) A parallel self-organizing overlapping community detection algorithm based on swarm intelligence for large scale complex networks. Futur Gener Comput Syst 89:265–285
Sun Y, Sun X, Liu Z, Cao Y, Yang J (2023) Core node knowledge based multi-objective particle swarm optimization for dynamic community detection. Comput Ind Eng 175:108843
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Advances in neural information processing systems 27
Su X, Xue S, Liu F, Wu J, Yang J, Zhou C, Hu W, Paris C, Nepal S, Jin D et al (2022) A comprehensive survey on community detection with deep learning. IEEE Trans Neural Netw Learn Syst
Symeonidis P, Mantas N (2013) Spectral clustering for link prediction in social networks with positive and negative links. Soc Netw Anal Min 3(4):1433–1447
Szczepański PL, Barcz AS, Michalak TP, Rahwan T (2015) The game-theoretic interaction index on social networks with applications to link prediction and community detection. In: Twenty-fourth international joint conference on artificial intelligence
Tan E, Guo L, Chen S, Zhang X, Zhao Y (2013) UNIK: unsupervised Social Network Spam Detection. In: Proceedings of the 22nd ACM international conference on information & knowledge management
Taştan A, Muma M, Zoubir AM (2021) Sparsity-aware robust community detection (sparcode). Signal Process 187:108147
Tiwari S, Gupta RK, Kashyap R (2019) To enhance web response time using agglomerative clustering technique for web navigation recommendation. In: Behera HS, Nayak J, Naik B, Abraham A (eds) Computational intelligence in data mining. Springer, Singapore, pp 659–672
Traag VA, Šubelj L (2023) Large network community detection by fast label propagation. Sci Rep 13(1):2701
Tripathi A, Ghosh M, Bharti KK (2021) A new adaptive inertia weight based multi-objective discrete particle swarm optimization algorithm for community detection. In: Machine vision and augmented intelligence-theory and applications. Springer, Singapore, pp 287–302
Tseng CH, Chen YH, Chuang CC, Wu JH, Yang YS, Liang YW (2014) Keen-means: a web page clustering tool based on an self-adjustable k-means algorithm. In: Ubi-media computing and workshops (UMEDIA), 2014 7th international conference On, pp. 300–304. IEEE
Umbarkar AJ, Sheth PD (2015) Crossover operators in genetic algorithms: a review. ICTACT J Soft Comput. 6(1)
Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y et al (2017) Graph attention networks. Stat 1050(20):10–48550
Von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
Wan X, Zuo X, Song F (2020) Solving dynamic overlapping community detection problem by a multiobjective evolutionary algorithm based on decomposition. Swarm Evol Comput 54:100668
Wang Z, Liao J, Cao Q, Qi H, Wang Z (2015) Friendbook: a semantic-based friend recommendation system for social networks. IEEE Trans Mob Comput 14(3):538–551
Wang Y, Jian X, Yang Z, Li J (2017) Query optimal k-plex based community in graphs. Data Sci Eng 2(4):257–273
Wang Z, Wang C, Gao C, Li X, Li X (2020) An evolutionary autoencoder for dynamic community detection. SCIENCE CHINA Inf Sci 63:1–16
Wang X, Li J, Yang L, Mi H (2021) Unsupervised learning for community detection in attributed networks based on graph convolutional network. Neurocomputing 456:147–155
Wang Y, Bu Z, Yang H, Li H-J, Cao J (2021) An effective and scalable overlapping community detection approach: integrating social identity model and game theory. Appl Math Comput 390:125601
Wang B, Gu Y, Zheng D (2022) Community detection in error-prone environments based on particle cooperation and competition with distance dynamics. Phys A 607:128178
Wang J, Gao S, Wang L, Yu Z (2018) Micro-Blog Friend-Recommendation Based on Topic Analysis and Circle Found. In: 2018 IEEE fourth international conference on big data computing service and applications (BigDataService), pp. 176–180. IEEE
Wei W, Xu F, Tan CC, Li Q (2012) Sybildefender: defend against sybil attacks in large social networks. In: INFOCOM, 2012 Proceedings IEEE, pp. 1951–1959. IEEE
Wen S, Yang J, Chen G, Tao J, Yu X, Liu A (2019) Enhancing service composition by discovering cloud services community. IEEE Access 7:32472–32481
Wu H-Y, Chen Y-L (2020) Graph sparsification with generative adversarial network. In: 2020 IEEE international conference on data mining (ICDM), pp. 1328–1333. IEEE
Wu J, Chen L, Feng Y, Zheng Z, Zhou MC, Wu Z (2013) Predicting quality of service for selection by neighborhood-based collaborative filtering. IEEE Trans Syst Man Cybern Syst 43(2):428–439
Wu W, Kwong S, Zhou Y, Jia Y, Gao W (2018) Nonnegative matrix factorization with mixed hypergraph regularization for community detection. Inf Sci 435:263–281
Wu Z, Wang X, Fang W, Liu L, Tang S, Zheng H, Zheng Z (2021) Community detection based on first passage probabilities. Phys Lett A 390:127099
Xiao C, Freeman DM, Hwa T (2015) Detecting clusters of fake accounts in online social networks. In: Proceedings of the 8th ACM workshop on artificial intelligence and security
Xiaojun L (2017) An improved clustering-based collaborative filtering recommendation algorithm. Clust Comput 20(2):1281–1288
Xie Y, Wang X, Jiang D, Xu R (2019) High-performance community detection in social networks using a deep transitive autoencoder. Inf Sci 493:75–90
Xin X, Wang C, Ying X, Wang B (2017) Deep community detection in topologically incomplete networks. Phys A 469:342–352
Xu B, Yang D (2015) Study partners recommendation for xMOOCs learners. Comput Intell Neurosci 2015:15
Xu R, Che Y, Wang X, Hu J, Xie Y (2020) Stacked autoencoder-based community detection method via an ensemble clustering framework. Inf Sci 526:151–165
Xue J, Yang Z, Yang X, Wang X, Chen L, Dai Y (2013) VoteTrust: leveraging friend invitation graph to defend against social network Sybils. In: 2013 Proceedings IEEE INFOCOM
Yan C, Chang Z (2019) Modularized tri-factor nonnegative matrix factorization for community detection enhancement. Phys A 533:122050
Yan C, Chang Z (2020) Modularized convex nonnegative matrix factorization for community detection in signed and unsigned networks. Phys A 539:122904
Yan Y, Liu G, Wang S, Zhang J, Zheng K (2017) Graph-based clustering and ranking for diversified image search. Multimed Syst 23(1):41–52
Yang Z, Xue J, Yang X, Wang X, Dai Y (2016) VoteTrust: leveraging friend invitation graph to defend against social network sybils. IEEE Trans Dependable Secure Comput 13(4):488–501
Yang B, Huang X, Cheng W, Huang T, Li X (2022) Discrete bacterial foraging optimization for community detection in networks. Futur Gener Comput Syst 128:192–204
Yang Y, Shi P, Wang Y, He K (2022) Quadratic optimization based clique expansion for overlapping community detection. Knowl-Based Syst 247:108760
Yi Y, Jin L, Yu H, Luo H, Cheng F (2021) Density sensitive random walk for local community detection. IEEE Access 9:27773–27782
Yu C, Huang L (2016) A Web service QoS prediction approach based on time-and location-aware collaborative filtering. SOCA 10(2):135–149
Yu H, Kaminsky M, Gibbons PB, Flaxman A (2008) Sybilguard: defending against sybil attacks via social networks. IEEE/ACM Trans Netw 16(3):576–589
Yuan S, Zeng H, Zuo Z, Wang C (2023) Overlapping community detection on complex networks with graph convolutional networks. Comput Commun 199:62–71
Yuanyuan M, Xiyu L (2018) Quantum inspired evolutionary algorithm for community detection in complex networks. Phys Lett A 382(34):2305–2312
Yu D, Wang H, Chen P, Wei Z (2014) Mixed pooling for convolutional neural networks. In: Rough sets and knowledge technology: 9th international conference, RSKT 2014, Shanghai, China, October 24-26, 2014, Proceedings 9, pp. 364–375. Springer
Žalik KR, Žalik B (2018) Memetic algorithm using node entropy and partition entropy for community detection in networks. Inf Sci 445:38–49
Zhang Z, Li Q (2011) Latent friend recommendation in social network services. J China Soc Sci Tech Inf 30(12):1319–1325
Zhang M, Zhou Z (2020) Structural deep nonnegative matrix factorization for community detection. Appl Soft Comput 97:106846
Zhang W, He H, Cao B (2014) Identifying and evaluating the internet opinion leader community based on k-clique clustering. Neural Comput Appl 25(3):595–602
Zhang Z, Liu Y, Ding W, Huang WW, Su Q, Chen P (2015) Proposing a new friend recommendation method, FRUTAI, to enhance social media providers’ performance. Decis Support Syst 79:46–54
Zhang W, Shang R, Jiao L (2020) Complex network graph embedding method based on shortest path and moea/d for community detection. Appl Soft Comput 97:106764
Zhang Y, Liu Y, Li J, Zhu J, Yang C, Yang W, Wen C (2020) WOCDA: a whale optimization based community detection algorithm. Phys A 539:122937
Zhang Y, Liu Y, Jin R, Tao J, Chen L, Wu X (2020) Gllpa: a graph layout based label propagation algorithm for community detection. Knowl-Based Syst 206:106363
Zhang Y, Qiao Y, Liu Z, Geng X, Jia H (2016) A novel multi-granularity service composition model. In: Asia-Pacific Services Computing Conference, Springer. pp. 33–51
Zhang Z, Sanjeev RK (2014) Detection of shilling attacks in recommender systems via spectral clustering. In: 17th International Conference on Information Fusion (FUSION), pp. 1–8. IEEE
Zhang Y, Xiong Y, Ye Y, Liu T, Wang W, Zhu Y, Yu PS (2020) Seal: learning heuristics for community detection with generative adversarial networks. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, pp. 1103–1113
Zhao Z, Ke Z, Gou Z, Guo H, Jiang K, Zhang R (2022) The trade-off between topology and content in community detection: an adaptive encoder-decoder-based nmf approach. Expert Syst Appl 209:118230
Zhao Y, Chen BY, Gao F, Zhu X (2023) Dynamic community detection considering daily rhythms of human mobility. Travel Behav Soc 31:209–222
Zheng Z, Ma H, Lyu MRL, King I (2011) Qos-aware web service recommendation by collaborative filtering. IEEE Trans Serv Comput 4(2):140–152
Zheng X-L, Chen C-C, Hung J-L, He W, Hong F-X, Lin Z (2015) A hybrid trust-based recommender system for online communities of practice. IEEE Trans Learn Technol 8:345
Zheng N, Song S, Bao H (2015) A temporal-topic model for friend recommendations in Chinese microblogging systems. IEEE Trans Syst Man Cybern Syst 45(9):1245–1253
Zheng Z, Ye F, Li R-H, Ling G, Jin T (2017) Finding weighted k-truss communities in large networks. Inf Sci 417:344–360
Zhou X, Cheng S, Liu Y (2020) A cooperative game theory-based algorithm for overlapping community detection. IEEE Access 8:68417–68425
Zhou X, Su L, Li X, Zhao Z, Li C (2023) Community detection based on unsupervised attributed network embedding. Expert Syst Appl 213:118937
Zhou L, Lü K, Cheng C, Chen H (2013) A game theory based approach for community detection in social networks. In: British national conference on databases, pp. 268–281. Springer
Zhu X (2006) Semi-supervised learning literature survey. Semi-Supervised Learning Literature Survey, Technical report, Computer Sciences, University of Wisconsin-Madison
Zhu X, Ma Y, Liu Z (2018) A novel evolutionary algorithm on communities detection in signed networks. Phys A 503:938–946
Zhu J, Chen B, Zeng Y (2020) Community detection based on modularity and k-plexes. Inf Sci 513:127–142
Zou F, Chen D, Huang D-S, Lu R, Wang X (2019) Inverse modelling-based multi-objective evolutionary algorithm with decomposition for community detection in complex networks. Phys A 513:662–674

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
