2021 - CmaGraph - Lin Et Al

CmaGraph: A TriBlocks Anomaly Detection Method in Dynamic Graph Using Evolutionary Community Representation Learning
Weiqin Lin1 , Xianyu Bao2 , and Mark Junjie Li1(B)

1 College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
linweiqin2019@email.szu.edu.cn, jj.li@szu.edu.cn
2 Shenzhen Academy of Inspection and Quarantine, Shenzhen, China
Abstract. Anomaly detection for dynamic graphs, in which the graph changes over time, is essential in many real-world applications. Existing works do not take accurate community structures in a dynamic graph into account. This paper introduces CmaGraph, a TriBlocks framework that uses an innovative deep metric learning block to measure the distances between vertices within and between communities obtained from an evolutionary community detection block. After these two functional blocks, a one-class anomaly detection block captures the dynamic graph's anomalous edges. The method significantly enhances the capability to detect anomalous edges by reconstructing the distances between the vertices of the evolutionary communities. We demonstrate its effectiveness on three real-world datasets and compare it with state-of-the-art methods.

Keywords: Anomaly detection · Dynamic graph · Evolutionary community detection · Deep metric learning
1 Introduction
Anomaly detection in a dynamic graph has a wide range of applications, such as computer networks, economic systems, and social networks [16]. Many anomalies arise as significant deviations from previous patterns [3]. For example, in a computer network, if a computer in one subnet suddenly sends many messages to computers in another subnet that it has rarely contacted before, those messages may be anomalous. In the dynamic graph, this computer appears as a vertex that gains many edges to surrounding vertices, producing a dense subgraph around the vertex or forming a new community. These newly added edges could be abnormal.
A crucial problem in anomaly detection for dynamic graphs is anomalous edge detection. Edges carry rich features about relationships and structures [17]. Finding anomalous edges is therefore useful in security domains such as intrusion detection systems, social network anomaly detection, and fault detection [3]. In this paper, we focus on anomalous edge detection in a dynamic graph.

Supported by the National key R&D project of China (No. 2018YFC1603601).
© Springer Nature Switzerland AG 2021
I. Farkaš et al. (Eds.): ICANN 2021, LNCS 12891, pp. 105–116, 2021.
https://doi.org/10.1007/978-3-030-86362-3_9

Limited work has addressed community structures in dynamic graph anomaly detection [5]. Many existing anomaly detection methods for dynamic graphs rely on heuristic rules [1,5,15]. These methods heuristically define anomaly features in a dynamic graph and then use the defined features for anomaly detection. However, heuristic methods struggle to adapt to the complex and variable anomaly patterns in large-scale dynamic graphs. With the popularity of deep learning, many anomaly detection methods for dynamic graphs now use deep learning techniques [21,22]. Compared with traditional heuristic rules, these methods learn better features that adapt to complex anomaly patterns. However, existing deep learning anomaly detection methods for dynamic graphs do not consider the dynamic graph's community structures.
The main difficulty in anomaly detection based on community structures is learning accurate community structures through representation learning in a dynamic graph. Typically, an attention-based or community-aware representation learning method first maps the dynamic graph into a feature space, and the resulting features are then used for anomaly detection. Learning accurate features therefore improves anomaly detection performance. Both the clique embedding of NetWalk [21] and the anomalous score layer of AddGraph [22] are designed for anomaly detection, and both achieve good performance. However, existing community-based deep learning methods for dynamic graphs target the general domain and do not consider how to apply community structures to anomaly detection.
We propose a dynamic graph anomaly detection framework, CmaGraph,
which detects a dynamic graph’s evolution community structures and learns
a community metric enhancement feature for subsequent anomaly detection. It
significantly enhances the capability to detect anomalous edges by reconstructing
the distances between vertices within and between communities. CmaGraph con-
sists of three blocks, Evolution Community Detection Block (C-Block), Commu-
nity Metric Enhancement Block (M-Block), and One Class Anomaly Detection
Block (A-Block). Specifically, the contributions of CmaGraph are as follows:
– CmaGraph detects the evolutionary community structures of dynamic graph.
– CmaGraph uses deep metric learning to learn community metric enhancement
feature for anomaly detection, which significantly enhances the capability to
detect anomalous edges.
– We experiment on three real datasets to prove the effectiveness of CmaGraph.
The rest of this article is organized as follows. We first summarize the related work in Sect. 2. In Sect. 3, we propose the CmaGraph framework, including the formulation of the method and the anomaly detection process. Then, in Sect. 4, we conduct experiments on three real datasets and show the performance of CmaGraph. Finally, we summarize this paper in Sect. 5.
2 Related Work
Most existing methods are based on heuristic rules. GOutlier [1] designed a reservoir sampling method to maintain a structural summary of the dynamic graph and dynamically partitioned the graph to build a model of connection behavior, which it then used to define outliers. CM-Sketch [15] used sketches to provide constant time and space complexity, and extracted global and local structural features to define outliers. StreamSpot [12] designed a similarity function between two graphs and used clustering algorithms to distinguish between normal and anomalous behaviors. GMicro [2] created hash-compressed micro-clusters from the graph stream by using hash-based edges, which reduces the size of the representation. SpotLight [7] encoded the graph by randomly sampling vertex sets and calculating the overlap between these vertex sets and the vertices of the current edge set; it then used a clustering algorithm to find anomalous graphs. The above methods use heuristic rules to define the features of the dynamic graph. However, anomaly patterns are variable and complex, and heuristic rules struggle to adapt to them.
With the development of deep learning, some methods use graph embedding for anomaly detection. Most existing works learn a static graph embedding at each timestamp through deep learning techniques [8,14,18]. The static graph embedding is then extended to a dynamic graph embedding by aggregation, sequence models, etc. [10,20]. However, in most cases, these dynamic graph embedding techniques target the general domain and may not work well for anomaly detection. Therefore, several anomaly detection methods based on dynamic graph embedding have been proposed recently. NetWalk [21] learned vertex embeddings on a set of random walk sequences with a custom autoencoder and introduced clique embedding for anomaly detection. AddGraph [22] used a Graph Convolutional Network [11] and a Gated Recurrent Unit to capture the structural and temporal features of the dynamic graph respectively, and introduced an anomalous score layer for anomaly detection. These two methods are based on graph embedding for anomaly detection; they learn better features, adapt to complex anomaly patterns, and perform better than heuristic rules. However, existing graph embedding methods for anomaly detection do not consider the dynamic graph's community structures. We detect evolutionary community structures and reconstruct the distances between vertices within and between communities for anomaly detection.
3 Proposed Method
In this section, we formalize the problem and propose the framework of our
method.
3.1 Problem Definition

A dynamic graph $\mathcal{G}$ is a temporal graph whose elements take the form $G^t = (\mathcal{V}^t, \mathcal{E}^t)$. Here $G^t$ is the graph in $\mathcal{G}$ at timestamp $t$, and $\mathcal{G} = \{G^t\}_{t=1}^{T}$. As the graph is updated, the incoming edge set is denoted by $E^t$, and the set of all vertices in $E^t$ is denoted by $V^t$. We set the entire vertex set $\mathcal{V}^t = \cup_{i=1}^{t} V^i$, the entire edge set $\mathcal{E}^t = \cup_{i=1}^{t} E^i$, $n = |\mathcal{V}^t|$, and $m^t = |E^t|$. At timestamp $t$, we use $A^t \in \mathbb{R}^{n \times n}$ to represent the adjacency matrix of $G^t$. We focus on undirected graphs, so $A^t$ is symmetric. Given $\mathcal{G}$ and timestamp $t$, our goal is to find anomalous edges in $E^t$ without labelled data. Specifically, this paper outputs anomalous score vectors $\{s^t\}_{t=1}^{T}$, where $s^t$ contains the anomalous scores of all edges in $E^t$, and obtains anomalous edges by setting a threshold.
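The setup above can be illustrated with a minimal sketch that accumulates an edge stream into one symmetric adjacency matrix per timestamp. The function name `build_adjacency`, the use of integer vertex indices, and the binary (unweighted) entries are assumptions made for illustration; the paper does not specify them.

```python
import numpy as np

def build_adjacency(edge_snapshots, n):
    """Return one symmetric n x n adjacency matrix per timestamp.

    edge_snapshots[t] holds the incoming edges at timestamp t as (u, v)
    index pairs; the matrix at timestamp t accumulates every edge seen so far.
    Binary entries are an assumption of this sketch.
    """
    A = np.zeros((n, n), dtype=np.float64)
    matrices = []
    for edges in edge_snapshots:
        for u, v in edges:
            A[u, v] = 1.0
            A[v, u] = 1.0  # undirected graph, so the matrix stays symmetric
        matrices.append(A.copy())
    return matrices
```

Each returned matrix plays the role of the adjacency matrix at one timestamp; the anomalous scores are then computed only for the incoming edges of that timestamp.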
3.2 CmaGraph Framework

From a global perspective, the main idea of CmaGraph is to detect evolutionary
community structures of G and enhance it for anomaly detection. Figure 1 shows
the overview of CmaGraph. The details of each part of the overview are explained
in the following.
Evolution Community Detection Block (C-Block). The goal of C-Block is to detect evolutionary community structures. We use adjacency matrices as the input of an autoencoder to obtain vertex embeddings and apply k-means to the vertex embeddings for community detection. Previous research indicates that many real-life dynamic graphs do not change drastically between consecutive timestamps [19]. Therefore, inspired by [13] and [19], we introduce a sparsity evolution autoencoder (SeAutoencoder), which produces stable vertex embeddings so that k-means yields stable community labels. This ensures that community structures do not change drastically. Figure 2 shows the inputs and outputs of C-Block on a synthetic dynamic graph, illustrating that C-Block obtains stable vertex embeddings and community labels in a dynamic graph.
Formally, at timestamp $t$, we receive the adjacency matrix $A^t$ of $G^t$ and set the hyper-parameter $k$, the number of communities. We construct the vertex embedding with an $l_s$-layer SeAutoencoder whose forward propagation formula is given by
$$f_s^{l+1} = \sigma(f_s^{l} W_s^{l} + b_s^{l}) \tag{1}$$

where $l = 1, \ldots, l_s - 1$, $f_s^{1} = A^t$, $W_s^{l}$ and $b_s^{l}$ are the weight matrix and bias vector of the $l$-th layer of SeAutoencoder, and the sigmoid function is $\sigma(z) = \frac{1}{1+\exp(-z)}$. We set $H^t = f_s^{l_s/2}$. We apply k-means with $k$ communities to $H^t$, so we can get a community label vector $c^t$ that contains the community label of each vertex. Here, $H^t \in \mathbb{R}^{n \times d}$, $d$ is the dimension of the vertex embedding, and $c^t \in \mathbb{R}^{n}$. The reconstruction loss function of SeAutoencoder is

$$J_{AE} = \frac{1}{2} \left\| f_s^{l_s} - A^t \right\|_F^2 \tag{2}$$
Fig. 1. The overview of CmaGraph. (a) dynamic graph, (b) adjacency matrices, (c)
Evolution Community Detection Block, (d) Community Metric Enhancement Block,
(e) One Class Anomaly Detection Block.
Fig. 2. Inputs and outputs of C-Block in a synthetic dynamic graph. (a) input graph
G t−1 , (b) output vertex embedding of G t−1 , (c) input graph G t , (d) output vertex
embedding of G t . c1 and c2 are two different communities. Arrows in (d) represent the
direction of movement of vertex embedding compared to (c).
where $\|\cdot\|_F$ is the Frobenius norm. Since the adjacency matrices are sparse, we introduce a sparsity constraint. The penalty term on the units of SeAutoencoder is defined by the Kullback-Leibler divergence [13],

$$\mathrm{KL}(\rho \,\|\, \hat{\rho}_j^{l}) = \rho \log \frac{\rho}{\hat{\rho}_j^{l}} + (1-\rho) \log \frac{1-\rho}{1-\hat{\rho}_j^{l}} \tag{3}$$
Fig. 3. Input and output of M-Block in a synthetic graph. (a) vertex embedding of
G t−1 , (b) community metric enhancement vertex embedding of G t−1 . c1 and c2 are two
different communities.
where $\rho$ is the sparsity parameter, $\hat{\rho}_j^{l}$ is the average activation of the $j$-th unit in the $l$-th layer, and $\hat{\rho}_j^{l} = \frac{1}{n} \sum_{i=1}^{n} f_{ij}^{l}$. When the graph is updated, the change of vertex embedding and community labels should not be too drastic. Therefore, we introduce a temporal loss $J_T$ between $H^t$ and $H^{t-1}$ [19], which is given by

$$J_T = \frac{1}{2} \left\| H^t - H^{t-1} \right\|_F^2 \tag{4}$$

where $J_T = 0$ if $t = 1$. With an $l_s$-layer SeAutoencoder, we want to minimize the final loss function, which is given by

$$J_{SeAutoencoder} = J_{AE} + \beta \sum_{l=1}^{l_s} \sum_{j} \mathrm{KL}(\rho \,\|\, \hat{\rho}_j^{l}) + \lambda J_T \tag{5}$$

where $\beta$ and $\lambda$ control the weights of the sparsity constraint and the temporal loss, respectively.
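To make the C-Block objective concrete, the following NumPy sketch evaluates the forward pass of Eq. (1) and the combined loss of Eq. (5) for given weights. It is only an illustration under stated assumptions: the function names are hypothetical, the embedding is taken from the middle entry of the activation list, the sparsity penalty is applied to the computed layers only (the input layer contributes a constant that does not depend on the weights), and no training loop is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(A, weights, biases):
    """Eq. (1): return the activations [f^1, ..., f^{l_s}] with f^1 = A."""
    activations = [A]
    for W, b in zip(weights, biases):
        activations.append(sigmoid(activations[-1] @ W + b))
    return activations

def kl_sparsity(rho, rho_hat, eps=1e-10):
    """Eq. (3), summed over the units of one layer."""
    rho_hat = np.clip(rho_hat, eps, 1.0 - eps)  # avoid log(0)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

def se_autoencoder_loss(A, H_prev, weights, biases, rho=0.1, beta=0.1, lam=1.0):
    """Eq. (5): reconstruction + sparsity + temporal loss; also returns H."""
    acts = forward(A, weights, biases)
    H = acts[len(acts) // 2]  # middle layer used as embedding (assumption)
    j_ae = 0.5 * np.sum((acts[-1] - A) ** 2)                        # Eq. (2)
    j_kl = sum(kl_sparsity(rho, f.mean(axis=0)) for f in acts[1:])  # computed layers only
    j_t = 0.0 if H_prev is None else 0.5 * np.sum((H - H_prev) ** 2)  # Eq. (4)
    return j_ae + beta * j_kl + lam * j_t, H
```

Passing `H_prev=None` corresponds to the first timestamp, where the temporal loss is zero.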
Community Metric Enhancement Block (M-Block). The goal of M-Block is to reconstruct the distances between vertices so that the Euclidean distance between vertices in the same community becomes smaller and the Euclidean distance between vertices in different communities becomes larger. As shown in Fig. 1d, the vertex embedding and the community label vector are the input of M-Block, and its output is the community metric enhancement vertex embedding. M-Block uses a community metric enhancement network (CenNet), a siamese network [6], to enhance the vertex embedding; siamese networks are one family of deep metric learning methods. CenNet reconstructs the distances between the vertices within the evolutionary communities. As shown in Fig. 3, the enhanced vertex embedding is better than the original vertex embedding because the Euclidean distances between vertices are more indicative of community membership than before.
Formally, at timestamp $t$, we receive $H^t$ and $c^t$ from C-Block. We construct the community metric enhancement vertex embedding $O^t \in \mathbb{R}^{n \times d}$ with an $l_c$-layer fully connected network, CenNet, with $d$ units in each layer, whose forward propagation formula is given by

$$f_c^{l+1} = \sigma(f_c^{l} W_c^{l} + b_c^{l}) \tag{6}$$

Here $l = 1, \ldots, l_c - 1$, $f_c^{1} = H^t$, $W_c^{l}$ and $b_c^{l}$ are the weight matrix and bias vector of the $l$-th layer of CenNet respectively, and $O^t = f_c^{l_c}$. The loss function of CenNet is the contrastive loss proposed by [6,9], given by

$$J_{CenNet} = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( y_{ij} d_{ij}^2 + (1 - y_{ij}) \max(b - d_{ij}, 0)^2 \right) \tag{7}$$

where $d_{ij} = \| O^t_{i\cdot} - O^t_{j\cdot} \|_2$ represents the Euclidean distance between samples $i$ and $j$, $O^t_{i\cdot}$ is the $i$-th row of matrix $O^t$, $y_{ij} = 1$ if samples $i$ and $j$ are in the same community and $y_{ij} = 0$ otherwise, and $b$ is the margin. Since $n$ may be large enough to make the calculation of (7) expensive, for a given sample $i$, instead of going through the whole dataset to get index $j$, we use negative sampling to get index $j$, which reduces the complexity.
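The contrastive loss of Eq. (7) can be sketched in NumPy as follows. `contrastive_loss` evaluates the full double sum, while `sampled_contrastive_loss` draws a fixed number of random partners per vertex. The paper does not describe its negative sampling scheme, so the uniform partner sampling here is a stand-in assumption, and both function names are hypothetical.

```python
import numpy as np

def contrastive_loss(O, labels, margin=1.0):
    """Eq. (7) evaluated over all n * n vertex pairs."""
    diff = O[:, None, :] - O[None, :, :]
    d = np.sqrt(np.sum(diff ** 2, axis=-1))             # pairwise Euclidean distances
    y = (labels[:, None] == labels[None, :]).astype(float)  # 1 if same community
    per_pair = y * d ** 2 + (1.0 - y) * np.maximum(margin - d, 0.0) ** 2
    return per_pair.sum() / (2 * O.shape[0])

def sampled_contrastive_loss(O, labels, num_samples, rng, margin=1.0):
    """Same per-pair loss, but with num_samples random partners j per vertex i.

    Uniform sampling of j is an assumption standing in for the paper's
    unspecified negative sampling scheme.
    """
    n = O.shape[0]
    i = np.repeat(np.arange(n), num_samples)
    j = rng.integers(0, n, size=i.size)
    d = np.linalg.norm(O[i] - O[j], axis=1)
    y = (labels[i] == labels[j]).astype(float)
    per_pair = y * d ** 2 + (1.0 - y) * np.maximum(margin - d, 0.0) ** 2
    return per_pair.sum() / (2 * n)
```

The sampled variant touches `n * num_samples` pairs instead of `n * n`, which is the source of the complexity reduction mentioned above.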
One Class Anomaly Detection Block (A-Block). Given $E^t$, the goal of A-Block is to obtain anomalous scores of all edges in $E^t$. As shown in Fig. 1e, A-Block applies an edge encoder to $O^t$ to obtain edge embeddings. Given each edge $(u, v)$ in $E^t$ and $O^t$, the edge embedding of $(u, v)$ is $\exp(-(O^t_{u\cdot} - O^t_{v\cdot})^2)$, which makes better use of the distance information in the embedding. A-Block then feeds the edge embeddings into a One Class Neural Network (OCNN) [4], which is an anomaly detection model.
Formally, at timestamp $t$, we receive $O^t$ from M-Block and $E^t$. The edge encoder $\phi$ is an operator that computes the edge embedding $P^t \in \mathbb{R}^{m^t \times d}$ from $O^t$ and $E^t$. We introduce an $l_a$-layer fully connected network, OCNN, with $d$ hidden units in each hidden layer; its last layer has one unit, which represents the anomalous score. The forward propagation formula of OCNN is given by

$$f_a^{l+1} = \sigma(f_a^{l} W_a^{l} + b_a^{l}) \tag{8}$$

where $l = 1, \ldots, l_a - 2$. The last layer does not apply an activation function, which means $f_a^{l_a} = f_a^{l_a-1} W_a^{l_a-1} + b_a^{l_a-1}$. Here $f_a^{1} = P^t$, $W_a^{l}$ and $b_a^{l}$ are the weight matrix and bias vector of the $l$-th layer of OCNN, and the anomalous score vector is $s^t = f_a^{l_a}$. The loss function of OCNN is proposed by [4] and is given by

$$J_{OCNN} = \frac{1}{2} \sum_{l=1}^{l_a-2} \left\| W_a^{l} \right\|_F^2 + \frac{1}{2} \left\| W_a^{l_a-1} \right\|_2^2 + \frac{1}{\nu} \times \frac{1}{m^t} \sum_{i=1}^{m^t} \max(0, r - s_i^t) - r \tag{9}$$

where $r$ is the bias of the hyper-plane, and $\nu$ controls the number of data points that are allowed to cross the hyper-plane; $\nu$ is equivalent to the percentage of anomalies [4]. Finally, we get $s^t$ and classify anomalous edges by setting a threshold.
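As a rough sketch of the A-Block computations, the code below implements the edge encoder, the forward pass of Eq. (8) with a linear last layer, and the loss of Eq. (9) for given weights. The function names and the representation of edges as index pairs are assumptions for illustration; how $r$ and the weights are optimized is left out.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def edge_embedding(O, edges):
    """Edge encoder: exp(-(O_u - O_v)^2), applied elementwise per edge."""
    u = np.array([e[0] for e in edges])
    v = np.array([e[1] for e in edges])
    return np.exp(-(O[u] - O[v]) ** 2)

def ocnn_scores(P, weights, biases):
    """Eq. (8) for the hidden layers, then a linear last layer with one unit."""
    f = P
    for W, b in zip(weights[:-1], biases[:-1]):
        f = sigmoid(f @ W + b)
    return (f @ weights[-1] + biases[-1]).ravel()

def ocnn_loss(scores, weights, r, nu):
    """Eq. (9): weight penalties plus the hinge term on r - s_i, minus r."""
    hidden_penalty = sum(0.5 * np.sum(W ** 2) for W in weights[:-1])
    last_penalty = 0.5 * np.sum(weights[-1] ** 2)
    hinge = np.mean(np.maximum(0.0, r - scores)) / nu
    return hidden_penalty + last_penalty + hinge - r
```

With three layers, `weights` holds one hidden matrix of shape (d, d) and one output matrix of shape (d, 1), and `ocnn_scores` returns one score per edge.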
Dynamic Update. Formally, at timestamp $t$, we get the updated edge set $\Omega^t$. We update the adjacency matrix $A^t$ according to $\Omega^t$, and we use $A^t$ as the input of CmaGraph. Then we train the SeAutoencoder, CenNet, and OCNN with learning rate $\alpha$, starting from the previous weights. Our framework is summarized in Algorithm 1.
Algorithm 1. CmaGraph
Input: Graph stream $\mathcal{G}$, which contains the edge stream $\{E^i\}_{i=1}^{t}$ and the vertex set $V^t$
Parameter: $d$, $\alpha$, $\rho$, $\beta$, $\lambda$, $k$, $b$, $\phi$, $\nu$, $r$, $l_s$, $l_c$, and $l_a$
Output: anomalous score vectors $\{s^t\}_{t=1}^{T}$
1: Define the network structure of SeAutoencoder, CenNet, OCNN.
2: for t = 1 to T do
3:   Update $A^t$ according to $E^t$ and $V^t$
4:   Minimize $J_{SeAutoencoder}$ (5)
5:   $H^t = f_s^{l_s/2}$
6:   Apply k-means to $H^t$ with hyper-parameter $k$ to get community label $c^t$
7:   Minimize $J_{CenNet}$ (7) with the input $H^t$, $c^t$
8:   $O^t = f_c^{l_c}$
9:   Minimize $J_{OCNN}$ (9) with the input $O^t$, $E^t$
10:  $s^t = f_a^{l_a}$
11: return $\{s^t\}_{t=1}^{T}$
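Step 6 of Algorithm 1 assigns a community label to each vertex by clustering the embedding rows. A minimal Lloyd-style k-means in NumPy is sketched below; the random initialization from data points, the fixed iteration count, and the function name `kmeans` are assumptions, since the paper does not state which k-means variant or initialization it uses.

```python
import numpy as np

def kmeans(H, k, num_iters=50, seed=0):
    """Cluster the rows of H into k communities and return one label per row."""
    H = np.asarray(H, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct rows of H chosen at random (assumed initialization).
    centroids = H[rng.choice(H.shape[0], size=k, replace=False)].copy()
    labels = np.zeros(H.shape[0], dtype=int)
    for _ in range(num_iters):
        # Assign every vertex to its nearest centroid.
        dists = np.linalg.norm(H[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members; keep it if empty.
        for c in range(k):
            members = H[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
    return labels
```

The returned label vector has one entry per vertex and can serve as the community labels used to define which pairs count as same-community in the contrastive loss.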
4 Experiment
In this section, we show the setup of the experiment and the results compared
with other methods.
4.1 Experiment Setup
Dataset. We evaluate the performance of CmaGraph on the datasets shown in Table 1. UCI Message is a directed graph based on an online community at the University of California, where each vertex represents a user and each edge represents an interaction between users. Digg is based on the reply graph of the website Digg. Similar to UCI Message, each vertex represents a user and each edge represents a reply between users. DBLP-2010 is a collaboration network of authors from the computer science bibliography in 2010, where each vertex represents an author and each edge represents a collaboration between authors. Since anomalous data is difficult to obtain, we use the method of [21] to inject anomalous edges into the three datasets.
Table 1. Statistics of datasets

Dataset       #Node     #Edge     Max. Degree   Avg. Degree
UCI Message   1,899     13,838    255           14.57
Digg          30,360    85,155    283           5.61
DBLP-2010     300,647   807,700   238           5.37
Baseline. We compare CmaGraph with the following competing edge anomaly detection methods for dynamic graphs.
– GOutlier [1]. It maintains summaries of a graph by designing a sampling method, defines the outliers of the dynamic graph, and outputs an anomalous score for a given edge.
– CM-Sketch [15]. It introduces a sketch-based method to approximate the global and local structural properties of graphs. These approximations are used to find outliers.
– NetWalk [21]. It uses a vertex reservoir strategy to maintain summaries of the dynamic graph, uses a custom autoencoder to build vertex embeddings, and uses streaming k-means to detect anomalous edges.
Experimental Design. We evaluate CmaGraph in two settings: static and dynamic. In the static setting, we test whether CmaGraph can effectively detect community structures and enhance them for anomaly detection without dynamic updates. In the dynamic setting, we split the test set into multiple snapshots to measure the performance of CmaGraph under dynamic updates. We use AUC as the metric to compare different methods.
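For reference, AUC can be computed directly from its rank interpretation: the probability that a randomly chosen anomalous edge receives a higher score than a randomly chosen normal edge, with ties counted as one half. The sketch below assumes that higher scores mean "more anomalous"; the paper does not state the sign convention of its scores, and the function name `auc` is chosen for illustration.

```python
import numpy as np

def auc(scores, is_anomaly):
    """Pairwise-comparison AUC; assumes higher score = more anomalous."""
    scores = np.asarray(scores, dtype=float)
    y = np.asarray(is_anomaly, dtype=bool)
    pos = scores[y]    # scores of anomalous edges
    neg = scores[~y]   # scores of normal edges
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)
```

This pairwise form needs memory proportional to the product of the two class sizes, so it suits small examples; a sort-based implementation is preferable for large test sets.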
4.2 Experimental Result

Static Setting. For the static setting, we take 50% of the data as normal edges and use them as the input of CmaGraph for training. We inject 1%, 5%, and 10% anomalous edges into the remaining 50% of the data to form the test set. The dimension $d$ of the vertex embedding is set to 64. For C-Block, the number of clusters $k$ is set to 15, the sparsity parameter $\rho$ is set to 0.1, the weight $\beta$ of the sparsity constraint is set to 0.1, the weight $\lambda$ of the temporal loss is set to 1, and the number of layers $l_s$ of C-Block is set to 3. For M-Block, the margin $b$ is set to 1, and the number of layers $l_c$ of CenNet is set to 2. For A-Block, $\nu$ and $r$ are set to 0.05 and 1 respectively, the number of layers $l_a$ of OCNN is set to 3, and the output dimension of OCNN is set to 1. For UCI, the learning rate $\alpha$ of the three networks is set to 0.0001; for DBLP-2010 and Digg, $\alpha = 0.00001$.
Table 2. AUC results in static setting

Methods      UCI Messages              Digg                      DBLP-2010
             1%      5%      10%       1%      5%      10%       1%      5%      10%
GOutlier     0.7181  0.7053  0.6707    0.6963  0.6763  0.6353    0.7172  0.6891  0.6460
CM-Sketch    0.7270  0.7086  0.6861    0.6871  0.6581  0.6179    0.7097  0.6892  0.6332
NetWalk      0.7758  0.7647  0.7226    0.7563  0.7176  0.6837    0.7654  0.7388  0.6858
CmaGraph     0.9520  0.9574  0.9523    0.9117  0.9124  0.9178    0.8131  0.8148  0.8157
The results of CmaGraph and the baselines are shown in Table 2. Because the UCI and Digg datasets are the same as those used by NetWalk, and DBLP-2010 is similar to the DBLP dataset used by NetWalk, we use the baseline results reported by NetWalk [21]. The results of CmaGraph are averaged over 10 runs, and all variances are less than 0.001. CmaGraph surpasses all the other methods on all of the datasets. On UCI and Digg, CmaGraph improves AUC by at least 0.1554 over the baselines. On DBLP-2010, it improves AUC by at least 0.0477. The significant performance improvement is mainly because CmaGraph can effectively detect community structures and enhance them for anomaly detection by using deep metric learning, and the learned features can adapt to complex anomaly patterns. It also demonstrates that community structures can be effectively applied to graph anomaly detection through deep metric learning.
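The margins quoted above can be re-derived from Table 2 with a few lines of Python. The dictionary simply copies the reported AUC values; the helper name `min_gain` is hypothetical.

```python
# AUC values copied from Table 2 (columns: 1%, 5%, 10% anomalies).
table2 = {
    "UCI Messages": {
        "GOutlier": [0.7181, 0.7053, 0.6707],
        "CM-Sketch": [0.7270, 0.7086, 0.6861],
        "NetWalk": [0.7758, 0.7647, 0.7226],
        "CmaGraph": [0.9520, 0.9574, 0.9523],
    },
    "Digg": {
        "GOutlier": [0.6963, 0.6763, 0.6353],
        "CM-Sketch": [0.6871, 0.6581, 0.6179],
        "NetWalk": [0.7563, 0.7176, 0.6837],
        "CmaGraph": [0.9117, 0.9124, 0.9178],
    },
    "DBLP-2010": {
        "GOutlier": [0.7172, 0.6891, 0.6460],
        "CM-Sketch": [0.7097, 0.6892, 0.6332],
        "NetWalk": [0.7654, 0.7388, 0.6858],
        "CmaGraph": [0.8131, 0.8148, 0.8157],
    },
}

def min_gain(dataset):
    """Smallest AUC gain of CmaGraph over the best baseline in any column."""
    rows = table2[dataset]
    baselines = [m for m in rows if m != "CmaGraph"]
    gains = []
    for col in range(3):
        best_baseline = max(rows[m][col] for m in baselines)
        gains.append(rows["CmaGraph"][col] - best_baseline)
    return round(min(gains), 4)
```

The smallest gain on Digg (0.1554, at 1% anomalies) is lower than the smallest gain on UCI Messages (0.1762), which is why 0.1554 is the joint lower bound for those two datasets; the DBLP-2010 minimum is 0.0477.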
Dynamic Setting. For the dynamic setting, we split the test set evenly into multiple snapshots: 6, 7, and 10 snapshots for UCI, Digg, and DBLP-2010, respectively. For each snapshot, we update CmaGraph according to Algorithm 1. The hyper-parameters are the same as in the static setting. Figure 4 reports the results of the dynamic setting, where the baseline results are those reported by NetWalk [21] and the results of CmaGraph are averaged over 10 runs, with all variances less than 0.001. CmaGraph exceeds the other baselines on all the datasets. On UCI, Digg, and DBLP-2010, CmaGraph improves AUC over NetWalk by at least 0.16, 0.08, and 0.0045, respectively. The main reason that CmaGraph beats the baselines on all snapshots of all datasets is that it can detect the structural features of evolutionary communities and steadily enhance these features for anomaly detection. This also demonstrates that CmaGraph can learn evolutionary community structures that adapt to complex anomaly patterns.
Fig. 4. AUC results in dynamic setting with 5% anomalies
Stability of CmaGraph over Different Percentages of Training Data. In this part, we test the performance of CmaGraph with different percentages of training data. For each percentage, with 5% anomalous edges and the parameters of the static setting, we run CmaGraph 20 times on the Digg dataset to obtain Fig. 5. The AUC of CmaGraph increases gradually as the training percentage increases. From 10% to 20% training percentage, AUC increases the most. After 20% training percentage, the AUC grows steadily. Across the training percentages, the standard deviations are between 0.0003 and 0.0008, which shows the stability of CmaGraph. Even with only 10% of Digg as training data, CmaGraph exceeds the best baseline performance in the static setting with 5% anomalous edges, which shows that CmaGraph can achieve good performance with a small amount of data.
Fig. 5. Stability on Digg with different training percentages
5 Conclusion
In this paper, we propose the CmaGraph framework, which detects anomalous edges in a dynamic graph. CmaGraph uses three blocks to effectively detect evolutionary community structures and enhance them for anomaly detection. It significantly enhances the capability to detect anomalous edges by reconstructing the distances between the vertices of the evolutionary communities. We conduct experiments on three real-world datasets, and the results demonstrate the effectiveness and stability of CmaGraph and show that it outperforms existing methods in dynamic graph anomaly detection.
References
1. Aggarwal, C.C., Zhao, Y., Yu, P.S.: Outlier detection in graph streams. In:
2011 IEEE 27th International Conference on Data Engineering, pp. 399–409. IEEE
(2011)
2. Aggarwal, C.C., Zhao, Y., Yu, P.S.: On clustering graph streams. In: Proceedings
of the 2010 SIAM International Conference on Data Mining, pp. 478–489. SIAM
(2010)
3. Chalapathy, R., Chawla, S.: Deep learning for anomaly detection: a survey (2019).
arXiv preprint arXiv:1901.03407
4. Chalapathy, R., Menon, A.K., Chawla, S.: Anomaly detection using one-class neural
networks (2018). arXiv preprint arXiv:1802.06360
5. Chen, Z., Hendrix, W., Samatova, N.F.: Community-based anomaly detection in
evolutionary networks. J. Intell. Inf. Sys. 39(1), 59–85 (2012)
6. Chopra, S., Hadsell, R., LeCun, Y.: Learning a similarity metric discriminatively,
with application to face verification. In: 2005 IEEE Computer Society Conference
on Computer Vision and Pattern Recognition (CVPR’05), vol. 1, pp. 539–546.
IEEE (2005)
7. Eswaran, D., Faloutsos, C., Guha, S., Mishra, N.: Spotlight: detecting anomalies
in streaming graphs. In: Proceedings of the 24th ACM SIGKDD International
Conference on Knowledge Discovery & Data Mining, pp. 1378–1386 (2018)
8. Grover, A., Leskovec, J.: node2vec: scalable feature learning for networks. In: Pro-
ceedings of the 22nd ACM SIGKDD International Conference on Knowledge Dis-
covery and Data Mining, pp. 855–864 (2016)
9. Hadsell, R., Chopra, S., LeCun, Y.: Dimensionality reduction by learning an invari-
ant mapping. In: 2006 IEEE Computer Society Conference on Computer Vision
and Pattern Recognition (CVPR 2006). vol. 2, pp. 1735–1742. IEEE (2006)
10. Kazemi, S.M., et al.: Relational representation learning for dynamic (knowledge)
graphs: a survey (2019). arXiv preprint arXiv:1905.11485
11. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional
networks (2016). arXiv preprint arXiv:1609.02907
12. Manzoor, E., Milajerdi, S.M., Akoglu, L.: Fast memory-efficient anomaly detection
in streaming heterogeneous graphs. In: Proceedings of the 22nd ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining, pp. 1035–
1044 (2016)
13. Ng, A., et al.: Sparse autoencoder. CS294A Lect. Notes 72(2011), 1–19 (2011)
14. Perozzi, B., Al-Rfou, R., Skiena, S.: Deepwalk: Online learning of social represen-
tations. In: Proceedings of the 20th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, pp. 701–710 (2014)
15. Ranshous, S., Harenberg, S., Sharma, K., Samatova, N.F.: A scalable approach
for outlier detection in edge streams using sketch-based approximations. In: Pro-
ceedings of the 2016 SIAM International Conference on Data Mining, pp. 189–197.
SIAM (2016)
16. Ranshous, S., Shen, S., Koutra, D., Harenberg, S., Faloutsos, C., Samatova, N.F.:
Anomaly detection in dynamic networks: a survey. Wiley Interdiscipl. Rev. Com-
put. Stat. 7(3), 223–247 (2015)
17. Rossetti, G., Cazabet, R.: Community discovery in dynamic networks: a survey.
ACM Comput. Surv. (CSUR) 51(2), 1–37 (2018)
18. Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: Line: Large-scale infor-
mation network embedding. In: Proceedings of the 24th International Conference
on World Wide Web, pp. 1067–1077 (2015)
19. Wang, Z., Wang, C., Gao, C., Li, X., Li, X.: An evolutionary autoencoder for
dynamic community detection. Sci. China Inf. Sci. 63(11), 1–16 (2020). https://
doi.org/10.1007/s11432-020-2827-9
20. Yao, L., Wang, L., Pan, L., Yao, K.: Link prediction based on common-neighbors
for dynamic social network. Procedia Comput. Sci. 83, 82–89 (2016)
21. Yu, W., Cheng, W., Aggarwal, C.C., Zhang, K., Chen, H., Wang, W.: Netwalk: a
flexible deep embedding approach for anomaly detection in dynamic networks. In:
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge
Discovery & Data Mining, pp. 2672–2681 (2018)
22. Zheng, L., Li, Z., Li, J., Li, Z., Gao, J.: Addgraph: anomaly detection in dynamic
graph using attention-based temporal GCN. In: IJCAI, pp. 4419–4425 (2019)
