Adjusted Community-Aware Attributed Graph Anomaly Detection

ACGA: Adjusted Community-Aware Attributed Graph Anomaly Detection

Yuze Ge Yushuo Li
School of Data Science School of Data Science
Fudan University Fudan University
Shanghai, CN 200000 Shanghai, CN 200433
19307130176@fudan.edu.cn 19307110230@fudan.edu.cn

Yuanye Liu
School of Data Science
Fudan University
Shanghai, CN 200433
19307130235@fudan.edu.cn

Abstract
In this project, we use graph neural networks for anomaly detection on attributed
graphs. We employ the DOMINANT, AnomalyDAE and ComGA models: the first two
focus on the attributes and structural characteristics of the graph, while the last
also takes the community structure of the graph into consideration. Then, inspired
by AnomalyDAE, we refine the ComGA model into ACGA so that attributes interact
more directly with the community structure. Experiments show that our improved
model ACGA is competitive. Finally, we visualize and analyze the results.

1 Introduction
A large number of abnormal users exist in social networks. The behavior of an abnormal user differs
markedly from that of an ordinary user, for example publishing vast quantities of ads or false
information. Attributed networks provide a potent tool to model a wide range of complex systems,
such as social media networks. In a social network, a user’s ID can be viewed as a vertex in an
attributed graph. Users are connected with each other through various social activities, and the
corresponding vertices in the attributed graph are connected by edges. Rich profile information of a
user, such as age, gender, and income, can be viewed as features of the corresponding vertex.
Graph anomaly detection aims to find rare behaviours that deviate significantly from the majority
of vertices. As shown in Figure 1, graph anomalies can be divided into three categories: local,
global, and structure anomalies. When considering attribute information over the whole graph, user 1
is a global anomaly, since its value of attribute a2 is significantly higher than that of the others.
When considering attribute information within one community (e.g., C2), user 6 is a local anomaly,
as its value of attribute a3 deviates from those of the other users within C2. When considering
structure information across different communities, users 5 and 7 are structure anomalies because
they have links to other communities, while the other users in their communities have no
cross-community links.
Graph neural networks (GNNs) have been applied to graph anomaly detection with good results.
However, there are still some drawbacks. First, the convolution operation of GNNs captures only the
information of a vertex’s neighbors, so global anomalies become difficult to detect. Second, GNNs
focus only on the structure of the neighborhood and ignore the structure of the community. In this
project, to alleviate these problems, we take communities into account in graph anomaly detection.
In detail, we use an autoencoder to encode and decode the modularity matrix and fuse the hidden
states of the autoencoder with the layers of the GNN. We then reconstruct the attributed graph from
the hidden states of the decoders.
The division of our work is as follows:

• Ge Yuze: Analysis of ComGA, presentation, report
• Li Yushuo: Analysis of AnomalyDAE and ACGA, visualization, report
• Liu Yuanye: Analysis of DOMINANT, code implementation and experiments, report

Figure 1: A toy example of a social network graph with different types of graph anomalies

2 Related Work
2.1 Attribute Network

In social networks, users are not only connected with each other through various social
activities but are also associated with rich profile information. Attributed networks provide a
potent tool to handle the data heterogeneity that we often confront in social networks: they
record the observed node-to-node interactions and also encode a rich set of features for each
node [1, 3].
It has been widely studied and accepted that the attributes of linked nodes are strongly
correlated [2, 4], which can be attributed to social influence and the homophily effect in social
science theories [5, 6]. Therefore, important information may be hidden in the interactions between
node attributes and network structure. To advance node anomaly detection on graphs, these
attributes need to be captured and leveraged, so high-quality embeddings that reflect the latent
information in social networks are of great importance. The methods in our project model the
attributed network jointly and learn a low-dimensional vector representation for each node in the
network.

2.2 Anomaly Detection

Anomaly detection on attributed networks aims at finding nodes whose patterns deviate significantly
from the majority of reference nodes, a task that is pervasive in applications such as social
spammer detection. Early work on graph anomaly detection depended largely on dedicated domain
knowledge and statistical methods, where the features for detecting anomalies were mostly
handcrafted [7]. Handcrafting features is both time-consuming and labor-intensive, while real-world
graphs often contain a very large number of nodes and edges labeled with a large number of
attributes, and are thus large-scale and high-dimensional [7]. To overcome this problem, deep
learning approaches to graph anomaly detection can examine large-scale, high-dimensional data and
extract patterns from it, thereby achieving satisfactory performance without the burden of
handcrafting features [8].
There is already a large body of work on graph anomaly detection based on deep learning. These
works fall into two broad categories: supervised methods and unsupervised (self-supervised) methods.
As most large anomaly detection datasets have no ground-truth anomaly labels, supervised methods
are often impractical. Our project concentrates on unsupervised methods, specifically
autoencoders. Our models extract node attribute information as well as structural information from
a static attributed graph and evaluate the anomaly score of each node using a scoring function.

2.3 DOMINANT

DOMINANT [9] is a model introduced in "Deep Anomaly Detection on Attributed Networks", whose
authors present it as the first investigation of anomaly detection on attributed networks with a
carefully designed deep learning model. Specifically, they address the limitations of
existing methods and model the attributed network with a graph convolutional network (GCN). As
GCN handles the high-order node interactions with multiple layers of nonlinear transformations, it
alleviates the network sparsity issue and can capture the nonlinearity of the data as well as the
complex interactions between the two sources of information on attributed networks. To further
enable the detection of anomalous nodes, they introduce a deep autoencoder framework to reconstruct
the original attributed network from the node embeddings learned by GCN. The reconstruction errors
of nodes are then used to flag anomalies. The experimental results reported in [9] show the
proposed deep model outperforming the state-of-the-art methods it was compared with.

Figure 2: The framework of DOMINANT

2.4 AnomalyDAE

Although DOMINANT has been successful in some graph anomaly detection tasks, it neglects the
complex interactions between nodes and attributes by learning representations only for nodes
[9], while interactions between the two modalities are of great importance for capturing both
structure-induced and attribute-induced anomalies. To alleviate this problem, [10] proposes a deep
joint representation learning framework for anomaly detection through a dual autoencoder
(AnomalyDAE), which captures the complex interactions between network structure and node attributes
to obtain high-quality embeddings.
Specifically, AnomalyDAE consists of a structure autoencoder and an attribute autoencoder. Each has
its own encoder, but the decoders jointly use the node embedding and the attribute embedding. The
structure encoder extracts the structural information of nodes and their neighbors by using a
graph attention mechanism. The attribute encoder extracts the attribute information. The attribute
embedding then interacts with the structural embedding to jointly learn the latent representations
of nodes and attributes. By taking both the node embedding and the attribute embedding as inputs of
the attribute decoder, the cross-modality interactions between network structure and node
attributes are learned during the reconstruction of the node attributes.

3 Methodology
3.1 Motivation

The community structure and the rich global graph information it contains are not taken into
account by the previous models. Moreover, the convolution operation of GNNs captures only the
information of a vertex’s neighbors. Stacking more GCN layers can capture more distant neighbors
for global anomaly detection, but it suffers from over-smoothing of the vertex representations.
To alleviate these problems, the ComGA model [11] uses the following methods. First, an autoencoder
reconstructs the modularity matrix, which helps the model learn the community information of the
graph. Second, multiple gateways in the tGCN module propagate the community-specific
representations into the corresponding GCN layers. In this way, local and structure anomaly
information is fused into each vertex representation and over-smoothing of the vertex
representations is alleviated. This makes anomalous vertex representations more distinguishable and
anomaly-aware vertex representations easier to learn for multiple types of anomalies.

Figure 3: The framework of AnomalyDAE

Inspired by AnomalyDAE, we think the interaction between structure information and attribute
information may be very important for anomaly detection. Therefore, we let the structural embedding
of the ComGA model, which carries the community information, interact with the attribute embedding,
so as to capture the latent relationship between them. The overall framework of our model ACGA
for deep anomaly detection on attributed networks is shown in Figure 4.

Figure 4: The overall framework of ACGA

3.2 Community and Modularity Matrix

Definition 3.1. Modularity is a measure of how well a network is partitioned into communities,
estimated by the difference between the actual and the expected number of edges within a group.
The entries of the modularity matrix B are
$$b_{i,j} = a_{i,j} - \frac{k_i k_j}{2m}$$
where $A = (a_{i,j})$ is the adjacency matrix, $k_i$ is the degree of the vertex $v_i$ and
$m = \frac{1}{2}\sum_i k_i$.
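
As a small illustration of Definition 3.1, the modularity matrix can be computed directly from the adjacency matrix. This is a minimal NumPy sketch rather than the code used in our experiments; the function name `modularity_matrix` and the toy graph are hypothetical.

```python
import numpy as np

def modularity_matrix(A: np.ndarray) -> np.ndarray:
    """Return B with b_ij = a_ij - k_i * k_j / (2m) for an undirected graph."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)      # vertex degrees k_i
    two_m = k.sum()        # 2m, because m = (1/2) * sum_i k_i
    return A - np.outer(k, k) / two_m

# Hypothetical toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
B = modularity_matrix(A)
```

Each row of B sums to zero, because the expected number of edges of a vertex matches its actual degree.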

3.3 Community Detection Module

We employ an autoencoder to learn the community-specific representation. The autoencoder consists
of an encoder and a decoder, both of which are deep GNNs. The encoder learns the representation of
the modularity matrix, and the hidden state of the l-th layer is
$$H_l = \phi(W_e^l H_{l-1} + b_e^l)$$
where $\phi$ is the activation function, and $W_e^l$ and $b_e^l$ are learnable parameters. The
input $H_0$ is the modularity matrix B.
The decoder takes the output of the last encoder layer as its input and aims to reconstruct the
original data from the latent representation:
$$H'_l = \phi(W_d^l H'_{l-1} + b_d^l)$$
The reconstruction loss of the autoencoder is
$$L_{res} = \|B - \hat{B}\|_F^2$$
where $\hat{B}$ is the output of the decoder.
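
The forward pass of this autoencoder and its reconstruction loss can be sketched as follows. This is only an illustration under stated assumptions: the layers are applied exactly as the equations are written (so each column of a hidden state corresponds to one vertex), $\phi$ is taken to be tanh, the layer sizes are arbitrary, the weights are random and untrained, and the helper names `init_layer` and `forward` are hypothetical.

```python
import numpy as np

def init_layer(rng, n_out, n_in):
    """Small random weight W (n_out x n_in) and zero bias b (n_out,)."""
    return rng.normal(scale=0.1, size=(n_out, n_in)), np.zeros(n_out)

def forward(layers, H):
    """Apply H_l = phi(W_l @ H_{l-1} + b_l) layer by layer; phi = tanh (assumed)."""
    states = []
    for W, b in layers:
        H = np.tanh(W @ H + b[:, None])
        states.append(H)
    return states

rng = np.random.default_rng(0)
n = 6
B = rng.normal(size=(n, n))
B = (B + B.T) / 2                                # symmetric stand-in for a modularity matrix

encoder = [init_layer(rng, 4, n), init_layer(rng, 2, 4)]
decoder = [init_layer(rng, 4, 2), init_layer(rng, n, 4)]

enc_states = forward(encoder, B)                 # H_1, H_2
dec_states = forward(decoder, enc_states[-1])    # H'_1, H'_2
B_hat = dec_states[-1]
loss_res = np.sum((B - B_hat) ** 2)              # squared Frobenius norm
```

Keeping every hidden state in `enc_states` matters for the model, because these per-layer community representations are what the gateways later pass to tGCN.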

3.4 tGCN Module

We first use tGCN to encode the attributed graph. The vertex representation of the l-th layer of
tGCN is
$$Z_l = \phi\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} Z_{l-1} W_{l-1}\right)$$
where $\tilde{A} = A + I$, $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, and I is the identity matrix.
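In order to fuse community information into the tGCN module, a gateway is designed to combine the
two representations of the community and the attributed graph:
$$\tilde{Z}_l = H_l + Z_l$$
We replace $Z_{l-1}$ with $\tilde{Z}_{l-1}$ to generate the new representation $Z_l$:
$$Z_l = \phi\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} \tilde{Z}_{l-1} W_{l-1}\right)$$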
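
The propagation rule and the gateway can be sketched in a few lines of NumPy. This is a hedged illustration, not our training code: $\phi$ is assumed to be ReLU, the community representation is assumed to have the same shape as the tGCN representation it is added to (one row per vertex), and the names `normalized_adjacency` and `gcn_layer` are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)                  # D~_ii = sum_j A~_ij (never zero: self-loops)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, Z_prev, W, H_prev=None):
    """One tGCN step; if H_prev is given, the gateway fuses it into the input first."""
    Z_in = Z_prev if H_prev is None else H_prev + Z_prev   # gateway: Z~ = H + Z
    return np.maximum(A_norm @ Z_in @ W, 0.0)              # phi = ReLU (assumed)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])                    # toy path graph on three vertices
X = rng.normal(size=(3, 4))                  # attribute matrix, input of the first layer
W0, W1 = rng.normal(size=(4, 2)), rng.normal(size=(2, 2))
H1 = rng.normal(size=(3, 2))                 # stand-in community representation

A_norm = normalized_adjacency(A)
Z1 = gcn_layer(A_norm, X, W0)                # first layer, no fusion
Z2 = gcn_layer(A_norm, Z1, W1, H_prev=H1)    # second layer with gateway fusion
```

Because the fusion is a plain sum, passing an all-zero community representation reduces the layer to an ordinary GCN layer.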

The attribute matrix X is the input of the first layer of tGCN. Note that by combining the
representations of the community and the attributed graph, the community structure information is
propagated to each vertex’s representation.
The community-guided loss aims to integrate the autoencoder and tGCN:
$$L_{gui} = KL(Z \,\|\, H)$$
where Z is the output of the final layer of tGCN and H is the input of the community decoder.
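
The text does not state how Z and H are turned into probability distributions before the KL divergence is taken. The sketch below assumes, purely for illustration, a row-wise softmax over the embedding dimensions and a sum over vertices; the names `softmax_rows` and `kl_guide_loss` are hypothetical.

```python
import numpy as np

def softmax_rows(M):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    M = M - M.max(axis=1, keepdims=True)
    E = np.exp(M)
    return E / E.sum(axis=1, keepdims=True)

def kl_guide_loss(Z, H, eps=1e-12):
    """KL(P_Z || P_H) summed over vertices, with P = row-wise softmax (assumed)."""
    P, Q = softmax_rows(Z), softmax_rows(H)
    return float(np.sum(P * (np.log(P + eps) - np.log(Q + eps))))

rng = np.random.default_rng(0)
Z = rng.normal(size=(4, 3))      # stand-in for the final tGCN layer
H = rng.normal(size=(4, 3))      # stand-in for the community decoder input
loss_gui = kl_guide_loss(Z, H)
```

The loss is zero when the two representations induce the same distributions and grows as they diverge, which is what ties the two modules together during training.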

3.5 Anomaly Detection Module

We use the learned latent representation Z to reconstruct the topology structure and the
attributes of the graph.
We reconstruct the original graph structure A as
$$\hat{A} = \mathrm{sigmoid}\left(Z Z^T\right)$$
Specifically, we predict whether there is a link between two nodes:
$$p\left(\hat{A}_{ij} = 1 \mid z_i, z_j\right) = \mathrm{sigmoid}\left(z_i z_j^T\right)$$
Then we reconstruct the attribute matrix. We use a graph convolutional layer as the attribute
decoder:
$$\hat{X} = \phi\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} Z W_{ad}\right)$$

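The objective function can be formulated as
$$L_{rec} = (1 - \alpha)\|A - \hat{A}\|_F^2 + \alpha\|X - \hat{X}\|_F^2$$
where $\alpha$ is the parameter that controls the balance between structure reconstruction and
attribute reconstruction.
The anomaly score of node $v_i$ is determined according to
$$\mathrm{Score}(v_i) = (1 - \alpha)\|a_i - \hat{a}_i\|_2^2 + \alpha\|x_i - \hat{x}_i\|_2^2$$
where $a_i$, $\hat{a}_i$, $x_i$ and $\hat{x}_i$ are the i-th rows of $A$, $\hat{A}$, $X$ and
$\hat{X}$.
The joint loss function is
$$L = L_{res} + L_{gui} + L_{rec}$$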
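
The scoring step can be sketched as below, assuming the embedding Z and the reconstructed attributes have already been produced by a trained model. The toy inputs are random, one vertex is deliberately given a poor attribute reconstruction, and the function name `anomaly_scores` is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def anomaly_scores(A, X, Z, X_hat, alpha=0.5):
    """Per-vertex score (1 - alpha) * ||a_i - a^_i||^2 + alpha * ||x_i - x^_i||^2."""
    A_hat = sigmoid(Z @ Z.T)                        # structure reconstruction
    struct_err = np.sum((A - A_hat) ** 2, axis=1)   # row-wise squared error
    attr_err = np.sum((X - X_hat) ** 2, axis=1)
    return (1 - alpha) * struct_err + alpha * attr_err

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(5, 3))
Z = 0.1 * rng.normal(size=(5, 2))                   # stand-in for the learned embedding
X_hat = X.copy()
X_hat[3] += 5.0                                     # vertex 3 is reconstructed badly

scores = anomaly_scores(A, X, Z, X_hat, alpha=0.5)
ranking = np.argsort(-scores)                       # most anomalous vertices first
```

Summing the scores over all vertices gives back $L_{rec}$, so the score of a vertex is its share of the reconstruction loss.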

3.6 Adjusted Attribute Decoder

Here we use the attribute encoder of AnomalyDAE as the attribute encoder of ACGA:
$$\tilde{Y} = \sigma\left(X^T W^{A(1)} + b^{A(1)}\right)$$
$$Y = \tilde{Y} W^{A(2)} + b^{A(2)}$$
The attribute decoder takes both the structural embedding with community information learned by
the tGCN community encoder and the attribute embedding learned by the AnomalyDAE attribute encoder
as inputs to decode the original node attribute matrix:
$$\hat{X} = Z Y^T$$
in which the interactions between network structure and node attributes are jointly captured.
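
The shapes involved in this decoder are easy to get wrong, so a small sketch may help. It assumes $\sigma$ is the logistic sigmoid, uses random untrained weights, and picks arbitrary sizes (n vertices, m attributes, hidden width h, embedding width d); the function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attribute_embedding(X, W1, b1, W2, b2):
    """Y~ = sigma(X^T W1 + b1), Y = Y~ W2 + b2; one row of Y per attribute."""
    Y_tilde = sigmoid(X.T @ W1 + b1)     # (m, h)
    return Y_tilde @ W2 + b2             # (m, d)

def decode_attributes(Z, Y):
    """X^ = Z Y^T: entry (i, j) is the inner product of vertex i and attribute j."""
    return Z @ Y.T                       # (n, m)

rng = np.random.default_rng(0)
n, m, h, d = 5, 3, 4, 2
X = rng.normal(size=(n, m))                        # node attribute matrix
Z = rng.normal(size=(n, d))                        # stand-in for the tGCN embedding
W1, b1 = rng.normal(size=(n, h)), np.zeros(h)
W2, b2 = rng.normal(size=(h, d)), np.zeros(d)

Y = attribute_embedding(X, W1, b1, W2, b2)
X_hat = decode_attributes(Z, Y)
```

The encoder works on the transpose of X, so it embeds attributes rather than vertices, and the inner product in the decoder is where the structural embedding and the attribute embedding interact.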

4 Experiments

4.1 Dataset

We test our model on three different datasets.

• BlogCatalog: BlogCatalog is a blog sharing website. The bloggers on BlogCatalog can follow
each other, forming a social network. Users are associated with a list of tags that describe
themselves and their blogs, which are regarded as node attributes.
• Flickr: Flickr is an image hosting and sharing website. Similar to BlogCatalog, users can
follow each other and form a social network. Node attributes of users are defined by their
specified tags, which reflect their interests.
• ACM: ACM is an attributed network from the academic field. It is a citation network where each
paper is regarded as a node, and the links are the citation relations among different papers.
The attributes of each paper are generated from the paper abstract.

4.2 Performance

The performance is shown in Table 1.

Method         ACM      Flickr   BlogCatalog
DOMINANT       76.01    74.68    74.42
AnomalyDAE     75.13    78.34    75.08
ComGA          84.96    81.40    79.91
ACGA (Ours)    85.02    80.20    72.06
Table 1: AUC values (%) on three datasets
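
For reference, AUC can be computed from the anomaly scores and the ground-truth anomaly labels as the fraction of (anomalous, normal) vertex pairs that are ranked correctly. The sketch below shows this pairwise definition on made-up labels and scores; it is not necessarily the evaluation code behind Table 1, and the name `auc_from_scores` is hypothetical.

```python
import numpy as np

def auc_from_scores(labels, scores):
    """Fraction of (anomalous, normal) pairs where the anomaly scores higher; ties count 1/2."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (pos.size * neg.size)

# Made-up example: vertices 2 and 4 are the anomalies.
labels = [0, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.9]
auc = auc_from_scores(labels, scores)
```

The pairwise comparison uses memory proportional to the number of anomalies times the number of normal vertices, which is manageable at the dataset sizes used here.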

Because we had little time and limited computing resources to tune the hyperparameters and train
our ACGA model, we did not obtain its best performance. Nevertheless, ACGA achieves a small
improvement over ComGA on the ACM dataset (85.02 vs. 84.96), although it trails ComGA on Flickr
and BlogCatalog. We believe that, with further tuning, our model has the potential to outperform
the other three methods.

4.3 Visualization

First, we use the Louvain algorithm to partition each of the three datasets into communities:

(a) ACM (b) BlogCatalog (c) Flickr

Figure 5: Community visualization of the datasets

Each of the three datasets can be effectively separated into communities, which is conducive
to our subsequent analysis based on extracting two communities separately.
Then, we use Gephi to get the graph statistics for each of our datasets:

Statistic                         BlogCatalog   Flickr    ACM
Nodes                             5196          7576      16485
Edges                             345519        482555    164350
Anomalies                         300           450       600
Average Clustering Coefficient    0.123         0.331     0.46
Average Path Length               2.512         2.413     5.468
Modularity                        0.365         0.243     0.746
Graph Density                     0.013         0.008     0.001
Table 2: Statistics on three datasets

Based on Table 2, we choose the Flickr dataset to visualize the results of the model, for four
reasons. First, its average clustering coefficient is high (0.331, above that of BlogCatalog, the
other social network), which accords with the characteristics of the Flickr social network, in
which users can follow each other as in the real world. Second, its average path length is the
shortest among the three datasets, showing the "small-world phenomenon". Third, it has modular
structure: nodes within a module are closely connected, while nodes in different modules are
sparsely connected. Fourth, the graph density is small and the overall network is sparse. These
properties are typical of a real-world social media network, which is advantageous for our
analysis. Although ACM has a higher modularity and a lower density than Flickr, Flickr is a real
social media network.

(a) Flickr dataset (b) Flickr dataset with anomaly detection

Figure 6: Visualization of anomaly detection

We select two communities, A and B, and one of the anomalies. This anomaly connects to a large
number of nodes in both communities A and B at the same time, so it is a structure anomaly in the
sense described in the introduction. Thanks to the community encoder, which extracts the community
structural information, our model detects this anomalous node clearly.

(a) Community A (b) Community B (c) Community A+B (d) Anomaly node’s network

Figure 7: Communities and an anomalous node’s network

We also select two normal nodes, in communities A and B, and find that, compared with the
anomalous node, they have fewer neighbors and their neighbor networks lie essentially within a
single community. This is a typical difference between anomalous and normal nodes.

(a) Community A+B (b) Community A+B (c) Normal node’s network (d) Normal node’s network

Figure 8: Comparison of anomaly and normal nodes

5 Conclusion
To improve the performance of anomaly detection, in this project we propose a novel framework
(ACGA) based on the community-aware attributed graph anomaly detection framework (ComGA).
Specifically, we introduce an interaction module inspired by AnomalyDAE.
In the future, we would like to tune the training parameters to improve the performance and
complete the whole network structure.

References

[1] Li, J., Dani, H., Hu, X., Tang, J., Chang, Y., & Liu, H. (2017) Attributed network embedding for learning
in a dynamic environment. In Proceedings of the 2017 ACM on Conference on Information and Knowledge
Management, (pp. 387-396).
[2] Pfeiffer III, J. J., Moreno, S., La Fond, T., Neville, J., & Gallagher, B. (2014) Attributed graph models:
Modeling network structure with correlated attributes. In Proceedings of the 23rd international conference on
World wide web, (pp. 831-842).
[3] Akoglu, L., Tong, H., Meeder, B., & Faloutsos, C. (2012) Pics: Parameter-free identification of cohesive
subgroups in large attributed graphs. In Proceedings of the 2012 SIAM international conference on data mining,
(pp. 439-450).
[4] Shalizi, C. R., & Thomas, A. C. (2011) Homophily and contagion are generically confounded in
observational social network studies. In Sociological methods & research, 40, (pp. 211-239).
[5] Marsden, P. V., & Friedkin, N. E. (1993) Network studies of social influence. In Sociological Methods &
Research, (pp. 127-151).
[6] McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001) Birds of a feather: Homophily in social networks.
In Annual review of sociology, (pp. 415-444).
[7] Kim, H., Lee, B. S., Shin, W. Y., & Lim, S. (2022) Graph Anomaly Detection with Graph Neural Networks:
Current Status and Challenges. In IEEE Access.
[8] Längkvist, M., Karlsson, L., & Loutfi, A. (2014) A review of unsupervised feature learning and deep
learning for time-series modeling. In Pattern Recognition Letters, 42, (pp. 11-24).
[9] Ding, K., Li, J., Bhanushali, R., & Liu, H. (2019) Deep anomaly detection on attributed networks. In
Proceedings of the 2019 SIAM International Conference on Data Mining, (pp. 594-602).
[10] Fan, H., Zhang, F., & Li, Z. (2020) AnomalyDAE: Dual autoencoder for anomaly detection on attributed
networks. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing
(ICASSP), (pp. 5685-5689).
[11] Luo, X., Wu, J., Beheshti, A., Yang, J., Zhang, X., Wang, Y., & Xue, S. (2022) ComGA: Community-Aware
Attributed Graph Anomaly Detection. In Proceedings of the Fifteenth ACM International Conference on Web
Search and Data Mining, (pp. 657-665).
