Knowledge Enhanced Semantic Communication

Bingyan Wang, Rongpeng Li, Jianhang Zhu, Zhifeng Zhao, and Honggang Zhang

Abstract—In recent years, with the rapid development of deep learning and natural language processing technologies, semantic communication has become a topic of great interest in the field of communication. Although existing deep learning-based semantic communication approaches have shown many advantages, they still do not make sufficient use of prior knowledge. Moreover, most existing semantic communication methods focus on semantic encoding at the transmitter side, while we believe that the semantic decoding capability of the receiver also deserves attention. In this paper, we propose a knowledge enhanced semantic communication framework in which the receiver can more actively utilize the facts in the knowledge base for semantic reasoning and decoding, on the basis of only affecting the parameters rather than the structure of the neural networks at the transmitter side. Specifically, we design a transformer-based knowledge extractor to find the factual triples relevant to the received noisy signal. Extensive simulation results on the WebNLG dataset demonstrate that the proposed receiver yields superior performance on top of the knowledge graph enhanced decoding.

Index Terms—Semantic communication, knowledge graph, Transformer.

B. Wang, R. Li and J. Zhu are with the College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China (e-mail: {wangbingyan, lirongpeng, zhujh20}@zju.edu.cn). Z. Zhao and H. Zhang are with Zhejiang Lab, Hangzhou, China, as well as the College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China (e-mail: zhaozf@zhejianglab.com, honggangzhang@zju.edu.cn).

I. INTRODUCTION

Benefiting from the rapid development of deep learning (DL) and natural language processing (NLP), semantic communication has emerged with a special emphasis on the successful delivery of the semantics of a message, rather than on the conventional bit-level accuracy of traditional communication. There have been several interesting studies on semantic communication [1]–[5]. Among them, one of the popular paradigms is DL-based joint source-channel coding (JSCC). For example, Ref. [1] proposes a transformer-based semantic communication system for text transmission. Ref. [2] introduces a semantic communication system based on the Universal Transformer (UT) with an adaptive circulation mechanism. In order to reduce semantic transmission errors, Ref. [3] exploits hybrid automatic repeat request (HARQ), while Ref. [4] introduces an adaptive bit rate control mechanism. Moreover, Ref. [5] proposes a masked autoencoder (MAE) based system to robustly combat possible noise. Notably, a key assumption of these studies is that the transmitter and the receiver share common knowledge. On top of this assumption, the existing semantic communication methods jointly train the DL-based transmitter and receiver, and have proven their superiority over traditional communication methods. However, the receiver still lacks comprehensive knowledge understanding and reasoning ability, and cannot make full use of the implicit prior knowledge in complex sentences.

In order to improve the capability of knowledge understanding and reasoning, some studies propose to introduce the knowledge graph (KG), which stores human knowledge in a graph structure composed of entities and relationships [6], into semantic communication. In a KG, each fact is abstracted into a triple of the form (entity, relationship, entity). For example, Ref. [7] utilizes knowledge triples to represent the semantic information and evaluates the importance of each triple with an attention policy gradient algorithm. Ref. [8] proposes a semantic communication framework that encodes texts into KGs. Ref. [9] introduces a knowledge reasoning based semantic communication system. In Ref. [10], a reliable semantic communication system based on KG is proposed, which can adaptively adjust the transmitted triples according to the channel quality. In Ref. [11], the authors exploit the knowledge base by leveraging a logic programming language. In Ref. [12], the authors propose a semantic similarity-based approach to automatically identify and extract the most common concepts from the knowledge base.

Knowledge graphs have somewhat improved the ability of semantic communication systems to handle common knowledge. However, most existing works only consider optimizing the transmitter while ignoring the receiver. Typically, their transmitters achieve semantic encoding by capturing and embedding the factual triples of the sentences with knowledge graphs. Nonetheless, it is a great challenge for a knowledge base to cover all the semantic information of a sentence, and the missing information may be detrimental to communication efficiency. For example, a sentence like "She loves him" cannot be represented by any factual triple in the knowledge base, yet it might be the vital element of a transmitted text, which makes such a loss unacceptable. In effect, knowledge graphs can only describe the semantics of simple declarative sentences. Therefore, sending messages encoded solely as knowledge graph-based triples may cause extra semantic loss.

To address these issues, we propose a novel receiver-side scheme for semantic communication based on KG. Different from existing works that extract factual triples at the transmitter side as the semantic representation, we apply a knowledge extraction module at the receiver side as a semantic decoding assistant, which avoids injecting extra semantic noise and enhances the model's robustness in low-SNR environments, making semantic communication more effective. By doing so, the knowledge in the knowledge base can be utilized for decoding while only affecting the parameters rather than the structure of the deep neural networks (DNN) at the transmitter side. Unlike the transmitter-side schemes [8]–[10], since the received content is inevitably polluted by noise, it remains essential to accurately extract the factual triples from the noisy content before leveraging them to complement the decoding procedure. Therefore, rather than focusing on each word in a sentence, it is better to extract the semantic knowledge representation of the whole sentence from a novel perspective. For this purpose, we utilize transformer encoders to obtain the implicit semantic representation of a sentence. By integrating the KG and the knowledge extractor into the conventional semantic decoder, the receiver can extract knowledge from noisy messages and enhance its decoding capability.

The remainder of the paper is organized as follows. The system model and problem formulation are given in Section II. Section III describes the DNN structure of the knowledge enhanced semantic receiver. Section IV discusses the simulation settings and experimental results. Section V concludes the paper.
TABLE I
NOTATIONS USED IN THIS PAPER

Notation           Definition
Sβ(·), Sγ−1(·)     Semantic encoder and decoder
Cα(·), Cδ−1(·)     Channel encoder and decoder
Kθ(·)              Knowledge extractor
s, ŝ               Input and decoded sentence
h                  Semantically encoded vector
ĥ                  Channel decoded vector
x                  Transmitted signal
y                  Received signal
k                  Knowledge vector
t                  Index vector of extracted triples
nt                 Number of triples in the knowledge base
N                  Length of sentence
w                  Weight parameter for knowledge extraction
fk(·)              Knowledge embedding process

Fig. 1. The framework of the proposed knowledge graph enhanced semantic communication system. (Transmitter: embedding, transformer encoder as the semantic encoder Sβ(·), and dense layer as the channel encoder Cα(·). Receiver: dense layer as the channel decoder Cδ−1(·), knowledge extractor backed by the knowledge base, and transformer decoder as the semantic decoder Sγ−1(·). The two sides are connected by the physical channel.)
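To make the notation concrete, the following minimal Python sketch (the facts and labels are invented for illustration, in the spirit of Fig. 2, and are not the actual WebNLG knowledge base) shows a small knowledge base of factual triples together with the binary index vector t that marks which of the nt triples are relevant to a given sentence.

```python
# Illustrative only: a tiny knowledge base of (head, relation, tail) triples.
knowledge_base = [
    ("Batchoy", "ingredient", "Beef"),
    ("Batchoy", "country", "Philippines"),
    ("Ajoblanco", "region", "Andalusia"),
]
n_t = len(knowledge_base)  # number of triples in the knowledge base

# Index vector t for one sentence: t[i] = 1 if triple i is expressed by the
# sentence, 0 otherwise; these labels are later used to train the extractor.
sentence = "Beef is an ingredient of Batchoy which comes from the Philippines."
t = [1, 1, 0]

relevant_triples = [m for m, t_i in zip(knowledge_base, t) if t_i == 1]
print(n_t, relevant_triples)
```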
II. SYSTEM MODEL AND PROBLEM FORMULATION

A semantic communication system generally encompasses a semantic encoder and decoder, as depicted in Fig. 1. Without loss of generality, we denote the input sentence by s = [s1, s2, ..., sN] ∈ N^N, where si represents the i-th word (i.e., token) in the sentence. In particular, the transmitter consists of two modules, namely the semantic encoder and the channel encoder. The semantic encoder Sβ(·) extracts the semantic information in the content and represents it as a vector h ∈ R^{N×ds}, where ds is the dimension of each semantic symbol. Mathematically,

h = Sβ(s),  (1)

and then the channel encoder Cα(·) encodes h into symbols that can be transmitted over the physical channel as

x = Cα(h),  (2)

where x ∈ C^{N×c} is the channel vector for transmission and c is the number of symbols per token. Given that y ∈ C^{N×c} is the vector of received symbols after transmitting x over the physical channel, y can be formulated as

y = Hx + n,  (3)

where H denotes the channel matrix and n ∼ N(0, σ²I) is the additive white Gaussian noise (AWGN).

After receiving y, the receiver first decodes the content from the symbols with the channel decoder Cδ−1(·) and obtains the decoded vector ĥ ∈ R^{N×ds},

ĥ = Cδ−1(y).  (4)

Notably, semantic communication implicitly relies on prior knowledge shared between the transmitter and the receiver for the joint training process. Different from such prior knowledge, however, the knowledge base in our model refers to a collection of factual triples and can be located at the receiver side. To exploit the knowledge base, a knowledge extractor is further applied at the receiver side to extract and integrate relevant knowledge from the received signal into the aggregated knowledge vector k. In particular, the knowledge extracting and aggregating process can be formulated as

k = Kθ(ĥ),  (5)

where Kθ(·) represents the knowledge extractor.

Then the knowledge enhanced semantic decoder Sγ−1(·) leverages the channel decoded vector ĥ and the extracted knowledge vector k to obtain the decoded message ŝ = [ŝ1, ŝ2, ..., ŝN],

ŝ = Sγ−1(ĥ || k),  (6)

where Sγ−1(·) stands for the knowledge enhanced semantic decoder and || indicates the concatenation operator.

The accuracy of semantic communication is determined by the semantic similarity between the sent and received contents. In order to minimize the semantic errors between s and ŝ, the loss function takes the cross entropy of the two sequences,

Lmodel = − Σ_{i=1}^{N} q(si) log p(ŝi),  (7)

where Lmodel is the loss function, q(si) is the one-hot representation of si ∈ s, and p(ŝi) is the predicted probability of the i-th word.
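To make the end-to-end flow of (1)–(7) concrete, the sketch below is a minimal, illustrative PyTorch implementation. The layer sizes, the use of a plain transformer encoder stack on the decoding side, and the AWGN-only channel are assumptions made for the example, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal, illustrative sketch of Eqs. (1)-(7); sizes and module choices are
# assumptions for the example, not the paper's exact implementation.
VOCAB, N, d_s, c = 1000, 30, 128, 16   # vocabulary, sentence length, d_s, symbols per token

class Transmitter(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d_s)
        layer = nn.TransformerEncoderLayer(d_model=d_s, nhead=8, batch_first=True)
        self.semantic_encoder = nn.TransformerEncoder(layer, num_layers=3)   # S_beta(.)
        self.channel_encoder = nn.Linear(d_s, 2 * c)                         # C_alpha(.), real+imag parts

    def forward(self, s):
        h = self.semantic_encoder(self.embed(s))    # Eq. (1): h in R^{N x d_s}
        return self.channel_encoder(h)              # Eq. (2): x, 2c real symbols per token

def awgn_channel(x, snr_db):
    # Eq. (3) with H = I: y = x + n, noise power set by the target SNR.
    noise_power = x.pow(2).mean() / (10 ** (snr_db / 10))
    return x + noise_power.sqrt() * torch.randn_like(x)

class Receiver(nn.Module):
    def __init__(self):
        super().__init__()
        self.channel_decoder = nn.Linear(2 * c, d_s)                         # C_delta^{-1}(.)
        layer = nn.TransformerEncoderLayer(d_model=2 * d_s, nhead=8, batch_first=True)
        self.semantic_decoder = nn.TransformerEncoder(layer, num_layers=3)   # S_gamma^{-1}(.), simplified
        self.output = nn.Linear(2 * d_s, VOCAB)

    def forward(self, y, k):
        h_hat = self.channel_decoder(y)             # Eq. (4)
        z = torch.cat([h_hat, k], dim=-1)           # Eq. (6): concatenation h_hat || k
        return self.output(self.semantic_decoder(z))

# Toy forward pass; k would normally come from the knowledge extractor of Eq. (5).
s = torch.randint(0, VOCAB, (2, N))                 # a batch of token ids
tx, rx = Transmitter(), Receiver()
y = awgn_channel(tx(s), snr_db=0.0)
k = torch.zeros(2, N, d_s)                          # placeholder knowledge vector
logits = rx(y, k)
loss = F.cross_entropy(logits.reshape(-1, VOCAB), s.reshape(-1))   # Eq. (7)
print(float(loss))
```

In the actual system the decoding side uses a full transformer decoder and both AWGN and fading channels are evaluated; the sketch only illustrates how the modules of Fig. 1 connect.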
Instead of using traditional communication modules for physical-layer transmission, most existing studies choose to utilize end-to-end DNNs to accomplish the whole communication process, as shown in Fig. 1. The semantic encoders and decoders are typically based on transformers [13], while the channel encoding and decoding part can be viewed as an autoencoder implemented by fully connected layers. The whole semantic communication process is then reformulated as a sequence-to-sequence problem. Based on these DNNs, in this paper we primarily focus on developing an appropriate implementation of the knowledge extractor in (5) and the knowledge enhanced decoder in (6) to minimize the model loss function.

III. A KNOWLEDGE EXTRACTOR MODEL BASED SEMANTIC DECODER

A. The Design of the Knowledge Extractor

In this part, we discuss the implementation of the knowledge extractor enhanced semantic decoding task. The whole knowledge extraction process, as shown in Fig. 2, can be divided into two phases. The first phase executes an embedding task to obtain a representation of the decoded vector with the transformer encoders. In the second phase, we try to find all the triples corresponding to the representation with a multi-label classifier, and then embed the triples into a compressed format to assist the final decoding.

Fig. 2. The knowledge graph enhanced semantic decoder. (Phase 1: L transformer encoder layers, each with multi-head attention, feed-forward, and add & norm sublayers, embed the channel decoded vector ĥ into z(L). Phase 2: a linear layer with sigmoid activation outputs the indices t of the relevant triples, e.g., (⟨h⟩Batchoy, ⟨r⟩Ingredient, ⟨t⟩Beef) and (⟨h⟩Batchoy, ⟨r⟩Country, ⟨t⟩Philippines) for the sentence "Beef is an ingredient of Batchoy which comes from the Philippines."; the extracted triples are embedded into the knowledge vector k, which is concatenated with ĥ and fed to the semantic decoder.)

In particular, in order to extract the semantic representation, we adopt a model composed of a stack of L identical transformer encoders, each consisting of a multi-head attention mechanism as well as feed-forward and normalization sublayers [13]. Without loss of generality, assuming that z(l−1) is the output of the (l−1)-th encoder layer, where z(0) is equivalent to ĥ for the input layer, the self-attention mechanism of the l-th layer can be represented as

Attention(z(l−1)) = softmax(Q(l)K(l)T / √dk) V(l),  (8)

where Q(l) = z(l−1)WQ(l), K(l) = z(l−1)WK(l), and V(l) = z(l−1)WV(l). WQ(l), WK(l) and WV(l) are the projection matrices of the l-th layer, and dk is the model dimension. Furthermore, z(l−1) is added to the calculated result Attention(z(l−1)) via a normalized residual connection, that is,

a(l) = LayerNorm(Attention(z(l−1)) + z(l−1)),  (9)

where a(l) is the output and LayerNorm(·) denotes the layer normalization operation. Afterwards, a feed-forward network is involved as FFN(a(l)) = max(0, a(l)WF1(l) + bF1(l))WF2(l) + bF2(l), where WF1(l), WF2(l), bF1(l) and bF2(l) are the parameters of the feed-forward layer of the l-th encoder block. Next, we adopt another residual connection and layer normalization,

z(l) = LayerNorm(FFN(a(l)) + a(l)).  (10)

After L layers of encoding, the embedding representation z(L) of the channel decoded vector ĥ is obtained. Then a multi-label classifier is adopted to compute an indicator vector t of the triples associated with the representation,

t = sigmoid(z(L)Wt + bt),  (11)

where t = [t̂1, ..., t̂nt] ∈ R^{nt} and t̂i ∈ [0, 1] for all i = 1, ..., nt. nt denotes the number of triples in the knowledge base, and Wt and bt are the parameters of the classifier. If t̂i ≥ 0.5, the triple mi corresponding to index i is predicted to be relevant to the received content.

Ultimately, the relevant factual triples {mi} predicted by the model are embedded into a vector k = fk({mi}), where the embedding process is abstractly represented as fk(·). In particular, rather than computing the embeddings of the entities and relationships separately, we integrate each triple into one compressed representation. Subsequently, as in (6), the knowledge vector is concatenated with the decoded vector and fed into the semantic decoder.

B. The Training Methodology

In order to train the knowledge extractor Kθ(·), a complete semantic communication model is first required. During training, the sentences are first sent through the transmitter over the channel, and the channel decoded vector at the receiver is fed into the knowledge extractor. Afterwards, the knowledge extractor is trained by gradient descent with the parameters of the transmitter frozen. Since the number of negative labels is much larger than that of the positive labels in this classification, the weighted binary cross entropy (BCE) is utilized as the loss function, which can be represented as

Lknowledge = − Σ_{i=1}^{nt} wi [ti · log t̂i + (1 − ti) · log(1 − t̂i)],  (12)

where ti ∈ {0, 1} represents the training label and t̂i is the prediction output in (11). wi is the weight of the i-th position and is related to the hyperparameter w: for ti = 0, wi = w; otherwise wi = 1 − w. Increasing w results in a more sensitive extractor, but it also increases the false positive rate.
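As a concrete illustration of the extractor just described, the following PyTorch sketch stacks transformer encoder layers for Eqs. (8)–(10) (via nn.TransformerEncoder), adds the sigmoid multi-label classifier of Eq. (11), and uses the weighted BCE loss of Eq. (12). The pooling over tokens, the per-triple embedding table standing in for fk(·), and all dimensions are assumptions made for the example rather than the authors' exact design.

```python
import torch
import torch.nn as nn

# Illustrative sketch of the knowledge extractor K_theta; dimensions, the
# token pooling, and the triple-embedding choice are assumptions.
d_s, n_t, L = 128, 500, 3     # model dimension, triples in the base, encoder layers

class KnowledgeExtractor(nn.Module):
    def __init__(self):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d_s, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=L)   # Eqs. (8)-(10)
        self.classifier = nn.Linear(d_s, n_t)                       # W_t, b_t of Eq. (11)
        self.triple_embedding = nn.Embedding(n_t, d_s)              # one vector per triple, f_k(.)

    def forward(self, h_hat):
        z = self.encoder(h_hat)                                     # z^(L), (batch, N, d_s)
        t_hat = torch.sigmoid(self.classifier(z.mean(dim=1)))       # Eq. (11), pooled over tokens
        mask = (t_hat >= 0.5).float()                               # triples predicted relevant
        # Aggregate the embeddings of the selected triples into k (batch, d_s);
        # k would then be broadcast along the token axis and concatenated with h_hat.
        k = (mask.unsqueeze(-1) * self.triple_embedding.weight).sum(dim=1)
        return t_hat, k

def weighted_bce(t_hat, t_true, w=0.02):
    # Eq. (12): weight w for negative labels (t_i = 0) and 1 - w for positives.
    weights = t_true * (1.0 - w) + (1.0 - t_true) * w
    bce = -(t_true * torch.log(t_hat + 1e-8) + (1.0 - t_true) * torch.log(1.0 - t_hat + 1e-8))
    return (weights * bce).sum(dim=-1).mean()

# Toy usage: h_hat is the channel decoded vector of Eq. (4).
h_hat = torch.randn(2, 30, d_s)
t_true = torch.zeros(2, n_t)
t_true[:, :3] = 1.0                                                 # a few relevant triples
extractor = KnowledgeExtractor()
t_hat, k = extractor(h_hat)
print(float(weighted_bce(t_hat, t_true)), tuple(k.shape))
```

Freezing the transmitter and optimizing only these parameters with weighted_bce mirrors the training procedure described above.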
Algorithm 1 The Semantic Communication Process with Knowledge Graph Enhanced Receiver
1: Require: models Sβ(·), Cα(·), Sγ−1(·), Cδ−1(·) and Kθ(·)
2: Input: tokenized sentence s
3: Output: decoded sentence ŝ
4: Transmitter:
5: Semantic encoding: h ← Sβ(s)
6: Channel encoding: x ← Cα(h)
7: Transmit x over the physical channel: y ← Hx + n
8: Receiver:
9: Channel decoding: ĥ ← Cδ−1(y)
10: Knowledge extraction Kθ(·):
11:   Compute the embedding representation z(L)
12:   t ← sigmoid(z(L)Wt + bt)
13:   Find the triples {mi} where t̂i ≥ 0.5
14:   Knowledge embedding: k ← fk({mi})
15: Semantic decoding: ŝ ← Sγ−1(ĥ || k)

The training complexity of the knowledge extractor is O(LN²·dk), which is of the same order as a transformer encoder. Notably, the knowledge extractor is not limited to the conventional transformer structure, but can also be applied to different transformer variants, such as the Universal Transformer (UT) [2]. With the self-attention mechanism, the extracted factual triples provide additional prior knowledge to the semantic decoder and therefore improve its performance. In our design, the knowledge vector is concatenated to the received message, rather than being merged into the source signal as previous works suggested. This architecture ensures that when the extractor is of little avail, the model can still function as a standard encoder-decoder transformer, while avoiding possible semantic losses introduced by the knowledge extraction procedure. Therefore, even if the knowledge extractor fails to find any relevant knowledge, the model still performs comparably to the baseline.

IV. NUMERICAL RESULTS

A. Dataset and Parameter Settings

The dataset used in the numerical experiments is based on WebNLG v3.0 [14], which consists of data-text pairs where the data is a set of triples extracted from DBpedia and the text is a verbalization of these triples. In the experiments, the weight parameter w is set to 0.02 and the learning rate is set to 10−4. Moreover, we set the dimension of the dense layer to 128 × 16 and adopt 8-head attention in the transformer layers. The detailed settings of the proposed system are shown in Table II. We train the models based on both the conventional transformer and the UT [2]. Besides, we adopt two metrics to evaluate their performance, namely the 1-gram Bilingual Evaluation Understudy (BLEU) [15] score for measuring word-level accuracy and the Sentence-Bert [16] score for measuring semantic similarity. Notably, Sentence-Bert is a Siamese Bert-network model that generates fixed-length vector representations of sentences, and the Sentence-Bert score computes the cosine similarity of the embedded vectors.

TABLE II
EXPERIMENTAL SETTINGS

Parameter                   Value
Train dataset size          24,467
Test dataset size           2,734
Weight parameter w          0.02
DNN optimizer               Adam
Batch size                  32
Model dimension             128
Learning rate               10−4
Channel vector dimension    16
Number of attention heads   8

B. Numerical Results

Fig. 3 and Fig. 4 show the BLEU score and the Sentence-Bert score of the transformer model with respect to the signal-to-noise ratio (SNR), respectively. It can be observed that the assistance of the knowledge extractor contributes significantly to improving the performance. In particular, regardless of the channel type, the knowledge extractor always brings more than a 5% improvement in BLEU under low-SNR scenarios. For the Sentence-Bert score, the knowledge-enhanced receiver shows a similar performance improvement. This demonstrates that the proposed scheme can improve the comprehension of semantics at the receiver side. Fig. 5 and Fig. 6 show the performance of the UT model under the BLEU and Sentence-Bert metrics, and a similar performance improvement can also be observed.

On the other hand, we test the performance of the knowledge extractor under different SNRs. As shown in Fig. 7, the extractor can achieve a recall rate of over 90%. However, the received content may be polluted by noise, resulting in an increase in false positives and a large gap between precision and recall. The number of encoder layers in the knowledge extractor may also affect the performance of the model. Therefore, we implement the knowledge extractor with different numbers of transformer encoder layers and present the performance comparison in Fig. 8. It can be observed that the 6-layer model performs slightly better than the 3-layer model, while the performance remains almost unchanged when the depth is further increased to 9 layers. Furthermore, in addition to utilizing a fixed model trained at a certain SNR, it is also possible to leverage several SNR-specific models, each trained at a specific SNR. Table III shows the performance comparison between a fixed extractor model (trained at 0 dB) and the SNR-specific models. It can be observed that, compared to the fixed model, the SNR-specific models yield superior performance. As a comparison, we also implement a scheme that utilizes the knowledge extractor for semantic encoding at the transmitter. Fig. 9 presents the corresponding simulation results, and it can be observed that this transmitter-based scheme is significantly inferior to the proposed scheme.
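For reference, the two metrics described in Section IV-A can be computed with standard open-source tooling; the sketch below assumes the nltk and sentence-transformers packages, and the particular Sentence-Bert checkpoint named in it is an assumption rather than necessarily the one used for the reported results.

```python
# Illustrative sketch of the 1-gram BLEU and Sentence-Bert scores; the
# checkpoint name is an assumption, not necessarily the paper's choice.
from nltk.translate.bleu_score import sentence_bleu
from sentence_transformers import SentenceTransformer, util

reference = "beef is an ingredient of batchoy which comes from the philippines"
candidate = "beef is an ingredient of batchoy from the philippines"

# 1-gram BLEU: all weight on unigram precision.
bleu_1 = sentence_bleu([reference.split()], candidate.split(), weights=(1.0, 0, 0, 0))

# Sentence-Bert score: cosine similarity between the two sentence embeddings.
model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode([reference, candidate], convert_to_tensor=True)
sbert_score = util.cos_sim(emb[0], emb[1]).item()

print(round(bleu_1, 3), round(sbert_score, 3))
```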
Fig. 3. The transformer model performance of BLEU score with respect to SNR (AWGN and fading channels, with and without knowledge).

Fig. 4. The transformer model performance of Sentence-Bert score with respect to SNR (AWGN and fading channels, with and without knowledge).

Fig. 5. The Universal Transformer model performance of BLEU score with respect to SNR (AWGN and fading channels, with and without knowledge).

Fig. 6. The Universal Transformer model performance of Sentence-Bert score with respect to SNR (AWGN and fading channels, with and without knowledge).

Fig. 7. The performance of the knowledge extractor (precision and recall with respect to SNR).

Fig. 8. The performance comparison under different numbers of encoder layers for the extractor (without knowledge, and with 3-, 6-, and 9-layer extractors).

TABLE III
THE PERFORMANCE COMPARISON BETWEEN A FIXED EXTRACTOR MODEL AND SNR-SPECIFIC MODELS

SNR/dB    Fixed     SNR-specific
-4        0.6514    0.6718
-2        0.7936    0.8126
0         0.8661    0.8661
2         0.9025    0.9134
4         0.9164    0.9201

Fig. 9. The performance comparison of the proposed scheme with a transmitter-based scheme (Sentence-Bert similarity score with respect to SNR).

V. CONCLUSION

In this paper, we have proposed a knowledge graph enhanced semantic communication framework in which the receiver can utilize prior knowledge from the knowledge graph for semantic decoding while requiring no additional modifications to the transmitter architecture. Specifically, we have designed a knowledge extractor to find the factual triples associated with the received noisy sentences. Simulation results on the WebNLG dataset have shown that the proposed system is able to exploit the prior knowledge in the knowledge base much more deeply and obtain performance gains at the receiver side. In the future, we will investigate the joint optimization of both the transmitter and the receiver enhanced by the knowledge graph.

REFERENCES

[1] H. Xie, Z. Qin et al., "Deep learning enabled semantic communication systems," IEEE Trans. Signal Process., vol. 69, pp. 2663–2675, 2021.
[2] Q. Zhou, R. Li et al., "Semantic communication with adaptive universal transformer," IEEE Wireless Commun. Lett., vol. 11, no. 3, pp. 453–457, 2021.
[3] P. Jiang, C.-K. Wen et al., "Deep source-channel coding for sentence semantic transmission with HARQ," IEEE Trans. Commun., vol. 70, no. 8, pp. 5225–5240, 2022.
[4] Q. Zhou, R. Li et al., "Adaptive bit rate control in semantic communication with incremental knowledge-based HARQ," IEEE Open J. Commun. Soc., vol. 3, pp. 1076–1089, 2022.
[5] Q. Hu, G. Zhang et al., "Robust semantic communications against semantic noise," in Proc. VTC2022-Fall, 2022.
[6] S. Ji, S. Pan et al., "A survey on knowledge graphs: Representation, acquisition, and applications," IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 2, pp. 494–514, 2022.
[7] Y. Wang, M. Chen et al., "Performance optimization for semantic communications: An attention-based learning approach," in Proc. IEEE Global Commun. Conf. (GLOBECOM), Madrid, Spain, Dec. 2021.
[8] F. Zhou, Y. Li et al., "Cognitive semantic communication systems driven by knowledge graph," in Proc. ICC, Seoul, South Korea, May 2022.
[9] J. Liang, Y. Xiao et al., "Life-long learning for reasoning-based semantic communication," in Proc. ICC Workshops, Seoul, South Korea, May 2022.
[10] S. Jiang, Y. Liu et al., "Reliable semantic communication system enabled by knowledge graph," Entropy, vol. 24, no. 6, p. 846, 2022.
[11] J. Choi, S. W. Loke, and J. Park, "A unified approach to semantic information and communication based on probabilistic logic," IEEE Access, vol. 10, pp. 129806–129822, 2022.
[12] V. Muppavarapu, G. Ramesh et al., "Knowledge extraction using semantic similarity of concepts from web of things knowledge bases," Data & Knowledge Engineering, vol. 135, p. 101923, 2021.
[13] A. Vaswani et al., "Attention is all you need," in Proc. NeurIPS, Long Beach, USA, Dec. 2017.
[14] C. Gardent, A. Shimorina et al., "Creating training corpora for NLG micro-planning," in Proc. ACL, Vancouver, Canada, Jul./Aug. 2017.
[15] K. Papineni, S. Roukos et al., "BLEU: A method for automatic evaluation of machine translation," in Proc. ACL, 2002.
[16] N. Reimers and I. Gurevych, "Sentence-BERT: Sentence embeddings using Siamese BERT-networks," in Proc. EMNLP-IJCNLP, Hong Kong, China, 2019.