
Entity alignment

Hiren M

Indian Institute of Science, Bangalore

May 3, 2023



Outline

▶ Why do we need entity alignment?
▶ Examples
▶ What is Entity Alignment (formal definition)?
▶ Datasets
▶ Problem 1
▶ CG-Mu-Align [1]
▶ Problem 2
▶ MultiKE [2]
▶ SEU [3]
▶ Limitations
▶ What are we working on?

[1] https://dl.acm.org/doi/abs/10.1145/3366423.3380289
[2] https://www.ijcai.org/proceedings/2019/0754.pdf
[3] https://aclanthology.org/2021.emnlp-main.226/
Why?

▶ Most KGs are extracted independently from separate sources, and different sources rarely contain the same information.
▶ As a result, the coverage of knowledge in any single KG is limited.
▶ Entity alignment is the problem of identifying corresponding entities in two different KGs and aligning them.
▶ Aligning multiple KGs enables a wide range of applications, including filling knowledge gaps and building cross-lingual applications.



Examples

Figure 1: English Wiki vs Chinese Wiki
Figure 2: IMDB vs Freebase



What is Entity Alignment?

▶ Given two KGs $G = (\mathcal{V}, \mathcal{E}, \mathcal{T}, \mathcal{R})$ and $G' = (\mathcal{V}', \mathcal{E}', \mathcal{T}, \mathcal{R})$, entity alignment tries to find entity pairs $\{(v_i, v_i') \in \mathcal{V} \times \mathcal{V}'\}$ with some evaluation metric in mind (generally Hit@k; see the sketch below), where:
▶ $\mathcal{V}$ : set of entities
▶ $\mathcal{E}$ : set of edges
▶ $\mathcal{T}$ : types of nodes
▶ $\mathcal{R}$ : types of relations
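A minimal Hit@k sketch to make the metric concrete, assuming alignment is scored by ranking all target embeddings for each source entity by Euclidean distance; the embedding arrays and gold pairing below are illustrative placeholders, not from any of the cited papers:

```python
import numpy as np

def hits_at_k(src_emb, tgt_emb, gold, k=10):
    """Fraction of source entities whose true counterpart
    ranks among the k nearest target embeddings.

    src_emb: (n, d) source-KG entity embeddings
    tgt_emb: (m, d) target-KG entity embeddings
    gold:    gold[i] = index of the true match of source entity i
    """
    # Pairwise Euclidean distances between all source/target entities.
    dists = np.linalg.norm(src_emb[:, None, :] - tgt_emb[None, :, :], axis=-1)
    # For each source entity, target indices sorted by distance.
    ranking = np.argsort(dists, axis=1)
    hits = sum(gold[i] in ranking[i, :k] for i in range(len(gold)))
    return hits / len(gold)

# Toy usage: the target KG is a noisy copy of the source KG.
rng = np.random.default_rng(0)
src = rng.normal(size=(100, 16))
tgt = src + 0.05 * rng.normal(size=(100, 16))
print(hits_at_k(src, tgt, gold=np.arange(100), k=1))
```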



Datasets

▶ Currently there are 3 main families of benchmark datasets for EA:
▶ Amazon-Wiki Music Dataset
▶ DBPedia
  ▶ DBPedia-Wikipedia
  ▶ DBPedia-Yago
▶ DBPedia Cross-Lingual
  ▶ English - Chinese
  ▶ English - Japanese
  ▶ English - French


Problem 1

▶ Question: why can't we just use an R-GCN to compute the embeddings and match the pairs with high similarity?
▶ A vanilla GCN model produces similar embeddings for an entity pair only if both KGs contain reasonably complete information about the entity.
▶ In practice, however, KGs are sparse, which leads to distinct embeddings for the same entity across the two KGs.



CG-Mu-Align
▶ At each layer, the information is transformed as:

$z_{i,j}^{k} = W_r^{k} h_j^{k-1}, \quad j \in \mathcal{N}_{i,r}$

where $\mathcal{N}_{i,r}$ is the set of neighbours of node $i$ connected via relation $r$.
▶ Two attentions are then computed: $\alpha$ (node-level) and $\beta$ (edge-level). The information is AGGREGATED as:

$z_i^{k} = \sum_{j \in \mathcal{N}_{i,r}} \alpha_{ij} \beta_{ij} z_{i,j}^{k}$

▶ Finally, this embedding is combined with the node's own representation (see the layer sketch below):

$h_i^{k} = \sigma([W_{self}^{k} h_i^{k-1} \,\|\, z_i^{k}])$

where $\|$ denotes the concatenation operation.
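A minimal sketch of one such layer, assuming the attention weights $\alpha$ and $\beta$ are already computed (their formulas are on the next slide); the names, shapes, and dict-based graph encoding are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cg_mualign_layer(h, W_r, W_self, alpha, beta, nbrs):
    """One aggregation layer in the style of CG-Mu-Align.

    h:      (n, d) node embeddings h^{k-1}
    W_r:    dict relation r -> (d, d) transform W_r^k
    W_self: (d, d) self-transform W_self^k
    alpha:  (n, n) node-level attention weights
    beta:   (n, n) edge-level attention weights
    nbrs:   dict node i -> list of (j, r) neighbour/relation pairs
    """
    n, d = h.shape
    z = np.zeros((n, d))
    for i, edges in nbrs.items():
        for j, r in edges:
            # z_{i,j}^k = W_r^k h_j^{k-1}, weighted by both attentions.
            z[i] += alpha[i, j] * beta[i, j] * (W_r[r] @ h[j])
    # h_i^k = sigma([W_self^k h_i^{k-1} || z_i^k]); concat doubles the width.
    return sigmoid(np.concatenate([h @ W_self.T, z], axis=1))
```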


CG-Mu-Align

▶ The node-level attention is calculated as (a sketch follows at the end of this slide):

$\alpha_p = \dfrac{\sum_{q \in \mathcal{N}_{i'}} e^{-\|z_p - z_q\|}}{\sum_{\hat{v} \in \mathcal{N}_i} \sum_{\hat{v}' \in \mathcal{N}_{i'}} e^{-\|z_{\hat{v}} - z_{\hat{v}'}\|}}, \qquad \alpha_q = \dfrac{\sum_{p \in \mathcal{N}_i} e^{-\|z_p - z_q\|}}{\sum_{\hat{v} \in \mathcal{N}_i} \sum_{\hat{v}' \in \mathcal{N}_{i'}} e^{-\|z_{\hat{v}} - z_{\hat{v}'}\|}}$

where $p$ ranges over the neighbours of entity $i$ and $q$ over the neighbours of its candidate counterpart $i'$.

▶ The edge-level attention is calculated as:

$\beta_{ij} = \dfrac{\exp(\sigma(a_r [z_i \,\|\, z_j]))}{\sum_{k \in \mathcal{N}_i} \exp(\sigma(a_r [z_i \,\|\, z_k]))}$
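A sketch of the node-level (cross-graph) attention under this reading; the inputs are placeholder arrays:

```python
import numpy as np

def node_attention(z_nbrs_i, z_nbrs_ip):
    """Cross-graph node-level attention for a candidate pair (i, i').

    z_nbrs_i:  (p, d) embeddings of i's neighbours
    z_nbrs_ip: (q, d) embeddings of i''s neighbours
    Returns (alpha_p, alpha_q), one weight per neighbour on each side.
    """
    # exp(-||z_p - z_q||) for every cross-graph neighbour pair.
    diff = z_nbrs_i[:, None, :] - z_nbrs_ip[None, :, :]
    sim = np.exp(-np.linalg.norm(diff, axis=-1))  # (p, q)
    denom = sim.sum()
    # A neighbour is up-weighted if it resembles the other side's neighbours.
    return sim.sum(axis=1) / denom, sim.sum(axis=0) / denom
```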



CG-Mu-Align

▶ For each pair of aligned entities $(i, i')$ in the training data, we sample $N$ negative entities from $KG_1$ and $KG_2$.
▶ The final representations $(h_i^K, h_{i'}^K)$ are then obtained from the two GNN encoders, and the following margin loss is applied (see the sketch below):

$\mathcal{L} = \sum_{(i,i')} \sum_{(i^-, i'^-)} \max\left(0,\ d(h_i^K, h_{i'}^K) - d(h_{i^-}^K, h_{i'^-}^K) + \gamma\right)$

where $d(a, b) = \|a - b\|_2$.
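A minimal sketch of this loss for one positive pair and its $N$ sampled negatives; the tensor names are placeholders:

```python
import numpy as np

def margin_loss(h_i, h_ip, h_neg_i, h_neg_ip, gamma=1.0):
    """Hinge loss pushing an aligned pair closer than negative pairs.

    h_i, h_ip:         (d,) final embeddings h_i^K, h_{i'}^K
    h_neg_i, h_neg_ip: (N, d) embeddings of N negative pairs
    gamma:             margin
    """
    d_pos = np.linalg.norm(h_i - h_ip)                  # d(h_i, h_i')
    d_neg = np.linalg.norm(h_neg_i - h_neg_ip, axis=1)  # d(h_i-, h_i'-)
    # The positive distance should undercut each negative by the margin.
    return np.maximum(0.0, d_pos - d_neg + gamma).sum()
```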



Problem 2 (annotations)

▶ The method described above assumes that pivots (labelled aligned pairs) are available.
▶ However, manually annotating an entire dataset with millions of nodes is not feasible.
▶ Hence, the field has been moving towards unsupervised methods.



MultiKE

▶ Entities in KGs have various features, but current embedding-based entity alignment methods exploit only one or two types of them. MultiKE exploits multiple feature views for entity alignment.
▶ MultiKE does not rely on annotations.



MultiKE

▶ MultiKE uses 3 different views and 3 different types of embeddings for those views (a TransE scoring sketch follows below):
▶ Name view embeddings
  ▶ Uses an autoencoder and pretrained word vectors to obtain name embeddings ($h^{(1)}$).
▶ Relation view embeddings
  ▶ Adopts TransE ($f_{rel}(h^{(2)}, r, t^{(2)}) = -\|h^{(2)} + r - t^{(2)}\|$) to compute relational features with negative sampling.
▶ Attribute view embeddings
  ▶ Stacks the attribute and value embeddings into a matrix $\langle a, v \rangle \in \mathbb{R}^{2 \times d}$ and encodes it with a CNN whose kernel has size $2 \times c$ ($c < d$).
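A sketch of the TransE score used by the relation view; the toy vectors are illustrative:

```python
import numpy as np

def f_rel(h, r, t):
    """TransE score: the closer to 0, the better h + r approximates t."""
    return -np.linalg.norm(h + r - t)

# A true triple scores higher than a corrupted (negative-sampled) one.
h, r = np.array([1.0, 0.0]), np.array([0.0, 1.0])
t_pos, t_neg = np.array([1.0, 1.0]), np.array([-3.0, 2.0])
print(f_rel(h, r, t_pos), f_rel(h, r, t_neg))  # 0.0 vs about -4.12
```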



Losses

▶ The method optimizes the following losses (a logistic-loss sketch follows below):
▶ Relational: $\mathcal{L}(\theta^{(2)}) = \sum_{(h,r,t) \in \mathcal{X}^+ \cup \mathcal{X}^-} \log(1 + \mathcal{A}[h,r,t]\, \exp(f_{rel}(h^{(2)}, r, t^{(2)})))$
▶ Attribute: $\mathcal{L}(\theta^{(3)}) = \sum_{(h,a,v) \in \mathcal{Y}^+} \log(1 + \exp(\|h^{(3)} - CNN(\langle a; v \rangle)\|))$
▶ Cross-relational:
  ▶ Given a triple $(h, r, t)$, if $(r, \hat{r})$ constitutes a relation alignment, then the cross-relational embeddings can also be optimized.
  ▶ First, a relation alignment set is constructed using the name embeddings of the relation types, i.e., $\mathcal{S}_{rel} = \{(r, \hat{r}, sim(r, \hat{r})) \mid sim(r, \hat{r}) > \eta\}$
  ▶ Then, $\mathcal{L}_{CRA}(\theta^{(2)}) = \sum_{(h,r,t) \in \mathcal{X}'''} sim(r, \hat{r})\, \log(1 + \exp(f_{rel}(h^{(2)}, r, t^{(2)})))$, where $\mathcal{X}'''$ is the set of relation facts whose relations appear in $\mathcal{S}_{rel}$.
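A sketch of the relational logistic loss, assuming (an illustrative convention, not stated on the slide) that $\mathcal{A}[h,r,t]$ is $-1$ for observed triples and $+1$ for negative-sampled ones, so the loss falls as observed triples fit and corrupted ones score low:

```python
import numpy as np

def f_rel(h, r, t):
    return -np.linalg.norm(h + r - t)  # TransE score, always <= 0

def relational_loss(triples, labels, eps=1e-12):
    """Logistic loss over positive and negative-sampled triples.

    triples: list of (h, r, t) embedding arrays
    labels:  A[h, r, t] in {-1, +1}; assumed -1 for observed triples
             and +1 for corrupted ones (illustrative convention).
    """
    total = 0.0
    for (h, r, t), a in zip(triples, labels):
        # eps guards log(0) when a == -1 and the triple fits exactly.
        total += np.log(max(1.0 + a * np.exp(f_rel(h, r, t)), eps))
    return total
```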



Combine

▶ The authors propose 3 ways to combine the 3 view-specific features (a sketch of the first follows below):
▶ Weighted view averaging
  ▶ $\tilde{h} = \sum_{i=1}^{D} w_i h^{(i)}$
  ▶ $w_i = \dfrac{\cos(h^{(i)}, \bar{h})}{\sum_{j=1}^{D} \cos(h^{(j)}, \bar{h})}$, where $\bar{h}$ is the average of the multi-view embeddings.
▶ Shared space learning
  ▶ Seeks to induce an orthogonal mapping matrix from each view-specific embedding space to a shared space:
  ▶ $L(\tilde{H}, Z) = \sum_{i=1}^{D} \left( \|\tilde{H} - H^{(i)} Z^{(i)}\|_F^2 + \|I - Z^{(i)T} Z^{(i)}\|_F^2 \right)$
▶ In-training combination
  ▶ The goal is to maximize the similarity between the combined embeddings and the view-specific embeddings in a unified embedding space:
  ▶ $L(\tilde{H}, H) = \sum_{i=1}^{D} \|\tilde{H} - H^{(i)}\|_F^2$
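A sketch of weighted view averaging for a single entity; the view matrix is a placeholder:

```python
import numpy as np

def weighted_view_average(views, eps=1e-12):
    """Combine D view-specific embeddings of one entity.

    views: (D, d) array, one row per view embedding h^(i).
    """
    h_bar = views.mean(axis=0)  # average of the multi-view embeddings
    # Cosine similarity of each view embedding to the average.
    cos = views @ h_bar / (np.linalg.norm(views, axis=1)
                           * np.linalg.norm(h_bar) + eps)
    w = cos / cos.sum()  # normalized weights w_i
    return w @ views     # h_tilde = sum_i w_i h^(i)
```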



SEU EA

▶ The authors assume that the source and target KGs are structurally and textually isomorphic. The entity alignment problem then reduces to an assignment problem, which can be solved with the Hungarian algorithm.
▶ SEU finishes in just a few seconds on publicly available datasets, even without using a GPU.



SEU EA (contd.)

▶ The assignment problem is a combinatorial optimization problem that maximizes total profit by finding the best assignment plan. With profit matrix $X$ and the set of permutation matrices $\mathbb{P}_N$, the objective is (see the sketch below):

$\arg\max_{P \in \mathbb{P}_N} \langle P, X \rangle_F$
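The Hungarian algorithm is available as scipy.optimize.linear_sum_assignment; the profit matrix below is a toy example:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy profit matrix X: X[i, j] = profit of assigning source i to target j.
X = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.8, 0.1],
              [0.0, 0.3, 0.7]])

# maximize=True solves argmax_P <P, X>_F over permutation matrices P.
rows, cols = linear_sum_assignment(X, maximize=True)
print(rows, cols)  # [0 1 2] [0 1 2]: each i is matched to target i
```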



SEU EA (contd.)

▶ Assumptions
  ▶ $A_s$ and $A_t$ are isomorphic, i.e., $P A_s P^{-1} = A_t$
  ▶ The translation system provides an accurate 1:1 correspondence between the textual features of corresponding entity pairs, i.e., $P H_s = H_t$
▶ Forming the objective
  ▶ From the above assumptions, we have

  $(P A_s P^{-1})^{l} P H_s = A_t^{l} H_t \quad \forall l \in \mathbb{N}$
  $P A_s^{l} H_s = A_t^{l} H_t \quad \forall l \in \mathbb{N}$

  ▶ Based on this, $P$ can be found by minimizing the Frobenius norm

  $\sum_{l=0}^{L} \|P A_s^{l} H_s - A_t^{l} H_t\|_F^2$



SEU EA (contd.)

▶ The authors claim and prove that minimizing this is equivalent to solving the following problem:

$\arg\max_{P \in \mathbb{P}_{|E|}} \left\langle P,\ \sum_{l=0}^{L} A_t^{l} H_t (A_s^{l} H_s)^{T} \right\rangle_F$

▶ This in turn becomes an LP (assignment) problem (see the pipeline sketch below):

$\arg\max_{P \in \mathbb{P}_{|E|}} \sum_{i,j} P[i,j]\, S[i,j]$

subject to $P[i,j] \in \{0, 1\}$, $\sum_j P[i,j] = 1$, and $\sum_i P[i,j] = 1$.
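A sketch of the full SEU pipeline under these formulas: accumulate $S = \sum_l (A_t^l H_t)(A_s^l H_s)^T$ by repeated propagation, then solve the assignment; all inputs are placeholders, not the authors' code:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def seu_align(A_s, A_t, H_s, H_t, L=2):
    """Align entities via S = sum_l (A_t^l H_t)(A_s^l H_s)^T.

    A_s, A_t: (n, n) adjacency matrices of the source/target KGs
    H_s, H_t: (n, d) textual feature matrices (assumed comparable
              across KGs, per the paper's translation assumption)
    """
    Fs, Ft = H_s, H_t
    S = Ft @ Fs.T  # l = 0 term
    for _ in range(L):
        Fs, Ft = A_s @ Fs, A_t @ Ft  # A^l H via repeated propagation
        S += Ft @ Fs.T
    # rows index target entities, cols index source entities.
    rows, cols = linear_sum_assignment(S, maximize=True)
    return dict(zip(cols.tolist(), rows.tolist()))  # source -> target
```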



Limitations?

▶ Most current state-of-the-art methods use only the pairwise relations between entities.
▶ There can also be nonlinear, high-order interactions involving multiple nodes, edges, or triples; such a higher-order underlying structure is called a simplex.



What are we working on?

▶ Inducing simplices from a given knowledge graph
▶ Devising methods to learn simplex features from entity features
▶ Devising methods to learn label-independent entity features that can be used for downstream tasks like entity alignment, KGC, drug repurposing, etc.



Thanks
