Cross-Lingual Knowledge Graph Entity Alignment by Aggregating Extensive Structures and Specific Semantics
https://doi.org/10.1007/s12652-022-04319-5
ORIGINAL RESEARCH
Received: 17 September 2021 / Accepted: 6 July 2022 / Published online: 18 July 2022
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022
Abstract
Entity alignment aims to link entities from different knowledge graphs (KGs) that refer to the same real-world identity.
Recently, embedding-based approaches that primarily center on topological structures have received close attention in this field. Despite achieving promising performance, these approaches overlook the vital impact of entity-specific semantics on entity alignment tasks. In this paper, we propose a new framework, SSEA (Extensive Structures and Specific Semantics for Entity Alignment), which jointly employs extensive structures and specific semantics to boost the performance of entity alignment. Specifically, we employ graph convolutional networks (GCNs) to learn the representations of entity structures. Besides considering entity representations, we also explore relation semantics by approximating relation embeddings based on head entity and tail entity representations. Moreover, attribute semantics are also learned by GCNs, independently of the joint entity and relation embeddings. The structure, relation, and attribute representations are concatenated for better entity alignment. Experimental results on three benchmark datasets from real-world KGs demonstrate that our approach achieves promising performance in most cases. Notably, SSEA achieves 91.78 and 97.20 for Hits@1 and Hits@10, respectively, on the DBP15K FR-EN dataset.
1 Introduction
12610 B. Zhu et al.
representation of the entities' extensive structures. It uses entity representations to approximate relation embeddings for relation semantics. Then it computes the joint representations of the entity based on entity embeddings and relation embeddings. In the second part, the SE module learns attribute semantic information from GCNs. Then we combine entity embeddings, relation embeddings, and attribute embeddings together to generate the embedding-based similarity matrix. In the third part, for an entity h12 to be aligned, the framework calculates the embedding-based similarities between it and all candidate entities. The entity most similar to h12, namely h22, is used as the output. We assume there are one-to-one alignments between testing source entities and testing target entities in this paper.

3.2 Joint entity and relation embedding module

3.2.1 Entity embedding

Figure 2 illustrates the forward propagation process of the GCN network used in our paper, where a number of hidden layers are inserted between the input and output layers, and the output of each layer is used as the input to the next layer. Specifically, let $H_s^{(l)}$ denote the entity node representation in the $l$-th GCN layer; the hidden state is updated as:

$$H_s^{(l+1)} = \mathrm{ReLU}\left(L H_s^{(l)} W_s^{(l)}\right), \quad (1)$$

where $L$ is the normalized adjacency matrix of the graph. To realize symmetric normalization, $L$ is set as $\hat{D}^{-\frac{1}{2}} \hat{A} \hat{D}^{-\frac{1}{2}}$, where $\hat{D}$ is the diagonal degree matrix of node degrees and $\hat{A} = A + I$, with $A$ the adjacency matrix and $I$ the identity matrix; adding $I$ to $A$ gives each entity a self-loop so that the characteristics of the node itself are considered. $W_s^{(l)} \in \mathbb{R}^{d^{(l)} \times d^{(l+1)}}$ is the weight matrix of the $l$-th layer, $d^{(l+1)}$ is the number of features in the $(l+1)$-th layer, and ReLU is the activation function.

The deeper the GCN, the more difficult the network is to train. To cope with this challenge, we use layer-wise highway gates as in RDGCN, which let more of the input pass directly to the output without a nonlinear transformation. Specifically, two gates are added: one is the transform gate and the other is the carry gate.
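The normalized propagation rule and the layer-wise highway gate can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation; the gate parameterization (a sigmoid transform gate with the carry gate as its complement, the usual highway-network form) is an assumption.

```python
import numpy as np

def normalized_adjacency(A):
    """Build L = D^{-1/2} (A + I) D^{-1/2}: add self-loops, then
    symmetrically normalize by the diagonal degree matrix."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(L, H, W):
    """One propagation step, Eq. (1): H^{(l+1)} = ReLU(L H^{(l)} W^{(l)})."""
    return np.maximum(L @ H @ W, 0.0)

def highway_gcn_layer(L, H, W, W_T, b_T):
    """Layer-wise highway gate: the transform gate T keeps part of the GCN
    output, while the carry gate (1 - T) passes the layer input through
    unchanged (requires d^{(l)} == d^{(l+1)})."""
    T = 1.0 / (1.0 + np.exp(-(H @ W_T + b_T)))  # sigmoid transform gate
    return T * gcn_layer(L, H, W) + (1.0 - T) * H
```

With all-zero gate parameters the transform gate is 0.5 everywhere, so each layer averages the GCN output with its input; training then learns how much of each to keep.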
3.2.2 Relation embedding

GCNs are able to capture the structural information of the knowledge graph to obtain the representation of each entity; however, they cannot directly obtain relation embeddings. For a relation $r$, there are many head entities and tail entities of $r$ in the knowledge graph, and these head and tail entities can provide its semantics. To compute the relation embedding, and to further optimize the entity representation by combining it with the original entity embedding, we approximate the representation of the relation by the average embeddings of the head and tail entities to which the relation is connected, following HGCN. The head and tail entity representations of each relation can be learned from GCNs to assist entity alignment. The vector representation of relation $r$ is:

$$r' = \mathrm{concat}\left(\frac{\sum_{m \in H_h} h_m}{\mathrm{card}(H_h)}, \frac{\sum_{n \in H_t} h_n}{\mathrm{card}(H_t)}\right), \quad (2)$$

where $\mathrm{concat}(\cdot)$ is a function that concatenates vectors and $\mathrm{card}(\cdot)$ is the number of elements in a set. $H_h$ and $H_t$ represent the sets of head and tail entities connected by the relation $r$, respectively; $h_m$ and $h_n$ represent the vector representations of the head and tail entities, respectively.

Then, a transformation matrix $W_r$ is applied to consider the relation semantics included in relation triples:

$$r = \mathrm{matmul}\left(r', W_r\right). \quad (3)$$

The joint entity representation is then:

$$h_e^{new} = \mathrm{concat}\left(h_e^{s}, \mathrm{matadd}\left(H_L^{r}, H_R^{r}\right)\right), \quad (4)$$

where $\mathrm{concat}(\cdot)$ means concatenation and $\mathrm{matadd}(\cdot)$ means matrix addition.

3.3 Attribute embedding

Attribute embedding is also learned by GCNs. Entity embedding and attribute embedding are trained separately, so we set two different feature vectors for structure and attribute, respectively. Let $H_a^{(l)}$ denote the attribute representation in the $l$-th layer; the convolution is computed in the same way as for entity embedding:

$$H_a^{(l+1)} = \mathrm{ReLU}\left(L H_a^{(l)} W_a^{(l)}\right), \quad (5)$$

where $L$ is the combined Laplacian; $W_a^{(l)} \in \mathbb{R}^{d^{(l)} \times d^{(l+1)}}$ is a layer-specific trainable weight matrix of the $l$-th layer, and $d^{(l+1)}$ is the number of features in the $(l+1)$-th layer.

We get the final entity representation $h_e^{final}$ by combining the joint entity representation $h_e^{new}$ described in the last section with the attribute representation $h_a$:

$$h_e^{final} = \mathrm{concat}\left(\beta h_e^{new}, (1-\beta) h_a\right), \quad (6)$$

where $\mathrm{concat}(\cdot)$ means concatenation and $\beta$ is an adjustable parameter that balances the importance of the joint embeddings and the attribute embeddings.
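Equations (2), (3), (4), and (6) can be sketched end to end as follows. Names and shapes are illustrative, not the authors' code; in particular, the relation-context vectors of Eq. (4) are taken here as given inputs, since their construction is described elsewhere in the paper.

```python
import numpy as np

def relation_embedding(H, heads, tails, W_r):
    """Eqs. (2)-(3): average the head- and tail-entity embeddings of a
    relation, concatenate the two averages, then project with W_r."""
    r_prime = np.concatenate([H[heads].mean(axis=0), H[tails].mean(axis=0)])
    return r_prime @ W_r  # matmul(r', W_r)

def joint_entity_embedding(h_s, h_L, h_R):
    """Eq. (4): concatenate the structural embedding with the matrix
    addition (matadd) of the relation-context vectors h_L and h_R."""
    return np.concatenate([h_s, h_L + h_R])

def final_entity_embedding(h_new, h_a, beta):
    """Eq. (6): beta-weighted concatenation of the joint embedding and
    the attribute embedding."""
    return np.concatenate([beta * h_new, (1.0 - beta) * h_a])
```

Because Eq. (6) concatenates rather than sums, the joint and attribute parts stay in separate coordinates of the final vector; beta only rescales their relative magnitudes for the downstream similarity computation.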
Fig. 3 The effect of different seed proportions on Hits@1

Fig. 4 The effect of the information balance parameter on Hits@1
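The Hits@k metric reported in these figures can be computed from the embedding-based similarity matrix; a minimal sketch, assuming the i-th test source entity aligns with the i-th candidate target entity (names are illustrative).

```python
import numpy as np

def hits_at_k(sim, k):
    """Fraction of test source entities whose true counterpart (assumed
    to be the candidate with the same row index) appears among the k
    most similar candidates."""
    topk = np.argsort(-sim, axis=1)[:, :k]   # indices of top-k candidates per row
    gold = np.arange(sim.shape[0])[:, None]  # ground-truth column index per row
    return float((topk == gold).any(axis=1).mean())
```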
Fig. 8 The effect of GCN layers on Hits@10

begin to decline. This is because the more layers the GCN has, the more neighborhood information it contains, which has a bad impact on computing the embedding vectors.

5 Conclusion

In this paper, a new framework, SSEA, for cross-lingual KG entity alignment is proposed, which combines extensive structures and specific semantics to improve entity alignment. We first employ GCNs to capture global topological structure representations of the KG. Then we approximate relation representations based on the embeddings of the head and tail entities. To make full use of attribute semantics to boost alignment performance, we also use GCNs to get attribute representations in an easy but effective way. Entity embeddings and relation embeddings are trained together, while attribute embeddings are independent of the joint entity and relation embedding. The structure, relation, and attribute representations are concatenated, and the final entity representations after concatenation are robust and accurate. We only need pre-aligned entity pairs

References

Bizer C, Lehmann J, Kobilarov G et al (2009) DBpedia - a crystallization point for the web of data. J Web Semant 7:154–165. https://doi.org/10.1016/j.websem.2009.07.002
Bordes A, Usunier N, García-Durán A, et al (December 2013) Translating embeddings for modeling multi-relational data. Paper presented at the 27th Advances in Neural Information Processing Systems, Lake Tahoe, Nevada, United States, 5–8
Chen M, Tian Y, Yang M, et al (August 2017) Multilingual knowledge graph embeddings for cross-lingual knowledge alignment. Paper presented at the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25
Dettmers T, Minervini P, Stenetorp P, et al (February 2018) Convolutional 2D knowledge graph embeddings. Paper presented at the 32nd Association for the Advancement of Artificial Intelligence, New Orleans, Louisiana, USA, 2–7
Guo H, Tang J, Zeng W et al (2021) Multi-modal entity alignment in hyperbolic space. Neurocomputing 461:598–607. https://doi.org/10.1016/j.neucom.2021.03.132
Hao Y, Zhang Y, He S, et al (September 2016) A joint embedding method for entity alignment of knowledge bases. Paper presented at the 1st Knowledge Graph and Semantic Computing, Beijing, China, 19–22
Jiang S, Nie T, Shen D, et al (September 2021) Entity alignment of knowledge graph by joint graph attention and translation representation. Paper presented at the 18th International Conference, Kaifeng, China, 24–26
Jiang T, Bu C, Zhu Y, et al (August 2019) Two-stage entity alignment: combining hybrid knowledge graph embedding with similarity-based relation alignment. Paper presented at the 16th Pacific Rim International Conference on Artificial Intelligence, Cuvu, Yanuca Island, Fiji, 26–30
Kearnes SM, McCloskey K, Berndl M et al (2016) Molecular graph convolutions: moving beyond fingerprints. J Comput Aided Mol Des 30:595–608
Kipf TN, Welling M (April 2017) Semi-supervised classification with graph convolutional networks. Paper presented at the 5th International Conference on Learning Representations, Toulon, France, 24–26
Lu G, Zhang L, Jin M et al (2021) Entity alignment via knowledge embedding and type matching constraints for knowledge graph inference. J Ambient Intell Humaniz Comput 4:1–11
Pang N, Zeng W, Tang J, et al (June 2019) Iterative entity alignment with improved neural attribute embedding. Paper presented at the 16th Extended Semantic Web Conference, Portoroz, Slovenia, 2
Song X, Zhang H, Bai L (August 2021) Entity alignment between knowledge graphs using entity type matching. Paper presented at the 14th Knowledge Science, Engineering and Management, Tokyo, Japan, 14–16
Sun J, Zhou Y, Zong C (December 2020) Dual attention network for cross-lingual entity alignment. Paper presented at the 28th International Conference on Computational Linguistics, Barcelona, Spain (Online), 8–13
Sun Z, Hu W, Li C (October 2017) Cross-lingual entity alignment via joint attribute-preserving embedding. Paper presented at the 16th International Semantic Web Conference, Vienna, Austria, 21–25
Sun Z, Hu W, Zhang Q, et al (July 2018) Bootstrapping entity alignment with knowledge graph embedding. Paper presented at the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19
Trouillon T, Welbl J, Riedel S, et al (June 2016) Complex embeddings for simple link prediction. Paper presented at the 33rd International Conference on Machine Learning, ICML 2016, New York City, USA, 19–24
Wang Z, Lv Q, Lan X, et al (2018) Cross-lingual knowledge graph alignment via graph convolutional networks. Paper presented at the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November, 2018
Wu Y, Liu X, Feng Y, et al (August 2019) Relation-aware entity alignment for heterogeneous knowledge graphs. Paper presented at the 28th International Joint Conference on Artificial Intelligence, Macao, China, 10–16
Wu Y, Liu X, Feng Y, et al (November 2019) Jointly learning entity and relation representations for entity alignment. Paper presented at the 9th International Joint Conference on Natural Language Processing, Hong Kong, China, 3–7
Xu K, Song L, Feng Y, et al (February 2020) Coordinated reasoning for cross-lingual knowledge graph alignment. Paper presented at Innovative Applications of Artificial Intelligence Conference, New York, USA, 7–12
Zhu Q, Zhou X, Wu J, et al (August 2019) Neighborhood-aware attentional representation for multilingual knowledge graphs. Paper presented at the 28th International Joint Conference on Artificial Intelligence, Macao, China, 10–16

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.