Abstractive Text Summary Generation With Knowledge Graph Representation
Abstract. With the enormous expansion of blogs, news stories, and reports, extracting usable information from this vast number of textual documents is a laborious task. Automatic text summarization is an effective solution for summarising these documents. The goal of text summarization is to compress long documents into brief summaries while retaining the important information and meaning. Many summarization models have been proposed to handle challenges such as saliency, fluency, and human readability while generating high-quality summaries. In this work, we present the Text-to-Text Transfer Transformer (T5) model for the task of abstractive summarization with knowledge graph representation. The experimental results show that the Text-to-Text Transfer Transformer model produces more conceptual, comprehensible, and abstractive summaries. To evaluate the quality of the generated summaries, the ROUGE and BLEU scores have been taken into consideration.
1 Introduction
2 Related Work
Text summarisation has been investigated since the 1950s, and most of the early research focused on extractive summarisation by analysing the structure of the words in the document [14]. Recently, recurrent neural networks (RNNs) have shown strong influence on image recognition, machine translation, and speech recognition. Motivated by this, Hu et al. [7] introduced the Large-Scale Chinese Short Text Summarization (LCSTS) dataset and used an RNN-based method for the task of text summarization. LCSTS contains almost 2 million Chinese texts with their summaries. The LCSTS dataset also includes relevance scores for 10,666 texts with respect to their corresponding source summaries.
Initially, the attentional encoder-decoder RNN model showed remarkable performance for the task of machine translation. Nallapati et al. [13] deployed this off-the-shelf attentional encoder-decoder RNN model for the task of text summarization on two different datasets (the DUC corpus and the CNN/Daily Mail corpus) and achieved state-of-the-art performance. In addition, they proposed a new dataset for multi-sentence summarization. The generative adversarial network is one of the recognized deep learning models for the task of text generation. Liu et al. [11] use the generator as an agent of reinforcement learning, which takes the text as input and generates abstractive summaries. Besides, a discriminator has been developed to help distinguish the generated summary from the ground-truth summary. A guided generation model has combined the capabilities of abstractive and extractive summarization methods [9]. Herein, an extractive summarization method is used to obtain the keywords from the text. Afterwards, a Key Information Guide Network (KIGN) encodes the keywords into key information, which guides the summary generation process. Unlike prior studies that used only a single encoder, the method of [17] uses a dual encoder, namely primary and secondary encoders. The primary encoder, in particular, performs conventional coarse encoding, whilst the secondary encoder models the relevance of words and provides finer encoding depending on the input raw text and the previously generated output summary. The two-level encodings are merged and sent into the decoder to produce a more diversified summary, which can reduce repetition in long sequence generation. A multi-head attention summarization (MHAS) model [10] learns important information in distinct representation subspaces using a multi-head attention mechanism. To prevent the generated summary from repeatedly duplicating terms, the MHAS model takes the previously predicted words into account while generating new words. It can also learn the article's underlying structure by adding a self-attention layer to the typical encoder and decoder, allowing the model to effectively retain the original information. To increase the model's performance, the multi-head attention distribution has been integrated into the pointer network.
3 Dataset Description
The CNN/Daily Mail dataset from Hugging Face has been used to train, validate, and test the proposed model. Each article has an average of 28 sentences. Preparing the CNN/Daily Mail dataset for the summarization task is one of our attempted contributions. The dataset mainly consists of news articles and highlight sentences. In the summarization setting, the highlight sentences are concatenated to form the article summary. We adopted parallel summarization in this task using this dataset. For training, 286,817 documents have been used; 13,368 documents have been used for validation; and 11,490 documents have been used for testing. The metadata of the dataset is shown in Table 1.
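For concreteness, the splits above can be loaded with the Hugging Face datasets library. The following is a minimal sketch, assuming the public "3.0.0" configuration of the cnn_dailymail dataset; the paper does not state which version was used.

```python
# Minimal sketch: loading the CNN/Daily Mail splits described above.
# Assumption: the public "3.0.0" configuration of the Hugging Face dataset;
# the paper does not state which version was used.
from datasets import load_dataset

dataset = load_dataset("cnn_dailymail", "3.0.0")
print(len(dataset["train"]), len(dataset["validation"]), len(dataset["test"]))
# expected: 286817 13368 11490

# Each example pairs a news article with its concatenated highlight
# sentences, which serve as the reference summary.
sample = dataset["train"][0]
print(sample["article"][:300])
print(sample["highlights"])
```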
4 System Architecture
The proposed approach constitutes two important modules, i.e., abstractive sum-
mary generation and knowledge graph representation. The summary generation
module generates the summary, whereas the knowledge graph representation
module represents the key concept of the document in the form of nodes and
edges where a node represents the subject & objects and edges represents the
verbs. The workflow of the proposed approach is depicted in Figure 1 where each
module works sequentially.
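This section does not give implementation details for either module, so the following is only an illustrative sketch: it uses the t5-base checkpoint from the Hugging Face transformers library for the summary generation module, and a spaCy dependency parse with networkx for the subject-verb-object graph. The model name, generation parameters, and triple-extraction heuristic are all assumptions, not the paper's exact implementation.

```python
# Illustrative sketch of the two modules. Model choice (t5-base) and the
# subject-verb-object heuristic are assumptions, not the paper's exact setup.
import networkx as nx
import spacy
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
nlp = spacy.load("en_core_web_sm")

def summarize(article: str, max_len: int = 150) -> str:
    """Module 1: abstractive summary generation with T5."""
    # T5 is conditioned on a task prefix; "summarize: " selects summarization.
    inputs = tokenizer("summarize: " + article, return_tensors="pt",
                       max_length=512, truncation=True)
    output_ids = model.generate(inputs.input_ids, max_length=max_len,
                                num_beams=4, early_stopping=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

def build_knowledge_graph(summary: str) -> nx.DiGraph:
    """Module 2: nodes are subjects/objects, edges are labelled with verbs."""
    graph = nx.DiGraph()
    for token in nlp(summary):
        if token.pos_ == "VERB":
            subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
            objects = [c for c in token.children if c.dep_ in ("dobj", "attr")]
            for subj in subjects:
                for obj in objects:
                    graph.add_edge(subj.lemma_, obj.lemma_, label=token.lemma_)
    return graph

summary = summarize("Long news article text ...")
kg = build_knowledge_graph(summary)
```

Running the modules sequentially in this way mirrors the workflow of Figure 1: the knowledge graph is built from the generated summary rather than from the full document.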
5 Experimental Results
The proposed approach has been tested on the CNN/Daily Mail dataset, which consists of news articles and their corresponding summaries. The efficiency of the proposed approach is measured in terms of ROUGE-1, ROUGE-2, ROUGE-L, and BLEU. All these metrics are calculated for each test document and then averaged over all the documents present in the test set. To obtain the ROUGE scores, a semantic comparison has been performed between the source summary and the predicted summary. The proposed approach delivers noticeable results for the task of abstractive text summarization. The experimental results have been compared with the state-of-the-art sequence-to-sequence RNN approach [13], and the proposed approach achieves better results. Table 2 shows the comparative analysis of the obtained results. In Figure 3, we show a comparative analysis of the summary generated by the proposed system against the source summary, where an evaluation score of 3 indicates that the approach generates a semantically correct summary without losing any important information, a score of 2 indicates that the approach generates a summary partially similar to the source summary, and a score of 1 indicates that the generated summary is not similar to the source summary.
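As a sketch of the evaluation procedure described above, per-document ROUGE-1/2/L and BLEU can be computed and then averaged over the test set. The rouge-score and nltk packages used here are assumptions, since the paper does not name its evaluation tooling.

```python
# Sketch of the per-document evaluation: compute ROUGE-1/2/L and BLEU for
# each test document, then average. Tooling (rouge-score, nltk) is assumed.
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
smooth = SmoothingFunction().method1

def evaluate(references, predictions):
    totals = {"rouge1": 0.0, "rouge2": 0.0, "rougeL": 0.0, "bleu": 0.0}
    for ref, pred in zip(references, predictions):
        scores = scorer.score(ref, pred)  # one Score per ROUGE variant
        for key in ("rouge1", "rouge2", "rougeL"):
            totals[key] += scores[key].fmeasure
        totals["bleu"] += sentence_bleu([ref.split()], pred.split(),
                                        smoothing_function=smooth)
    return {key: value / len(references) for key, value in totals.items()}
```

Averaging the F1 component of each ROUGE variant matches the common reporting convention for CNN/Daily Mail, though the paper does not state which component it reports.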
Acknowledgement
The work presented here falls under Research Project Grant No. IFC/4130/DST-CNRS/2018-19/IT25 (DST-CNRS targeted program). The authors would like to
express gratitude to the Centre for Natural Language Processing and Artificial
Intelligence Lab, Department of Computer Science and Engineering, National
Institute of Technology Silchar, India for providing infrastructural facilities and
support.
References
1. Alguliyev, R.M., Aliguliyev, R.M., Isazade, N.R., Abdi, A., Idris, N.: COSUM: Text summarization based on clustering and optimization. Expert Systems 36(1), e12340 (2019)
2. Allahyari, M., Pouriyeh, S., Assefi, M., Safaei, S., Trippe, E.D., Gutierrez,
J.B., Kochut, K.: Text summarization techniques: a brief survey. arXiv preprint
arXiv:1707.02268 (2017)
3. Babar, S., Patil, P.D.: Improving performance of text summarization. Procedia
Computer Science 46, 354–363 (2015)
4. Day, M.Y., Chen, C.Y.: Artificial intelligence for automatic text summarization. In:
2018 IEEE International Conference on Information Reuse and Integration (IRI).
pp. 478–484. IEEE (2018)
5. DeJong, G.: An overview of the frump system. Strategies for natural language
processing 113, 149–176 (1982)
6. Fattah, M.A., Ren, F.: Automatic text summarization. World Academy of Science,
Engineering and Technology 37(2), 192 (2008)
7. Hu, B., Chen, Q., Zhu, F.: LCSTS: A large scale Chinese short text summarization dataset. arXiv preprint arXiv:1506.05865 (2015)
8. Hu, T., Liang, J., Ye, W., Zhang, S.: Keyword-aware encoder for abstractive text
summarization. In: International Conference on Database Systems for Advanced
Applications. pp. 37–52. Springer (2021)
9. Li, C., Xu, W., Li, S., Gao, S.: Guiding generation for abstractive text summariza-
tion based on key information guide network. In: Proceedings of the 2018 Confer-
ence of the North American Chapter of the Association for Computational Linguis-
tics: Human Language Technologies, Volume 2 (Short Papers). pp. 55–60 (2018)
10. Li, J., Zhang, C., Chen, X., Cao, Y., Liao, P., Zhang, P.: Abstractive text summarization with multi-head attention. In: 2019 International Joint Conference on Neural Networks (IJCNN). pp. 1–8. IEEE (2019)
11. Liu, L., Lu, Y., Yang, M., Qu, Q., Zhu, J., Li, H.: Generative adversarial network for abstractive text summarization. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)
12. Luhn, H.P.: The automatic creation of literature abstracts. IBM Journal of Research and Development 2(2), 159–165 (1958)
13. Nallapati, R., Zhou, B., Gulcehre, C., Xiang, B., et al.: Abstractive text summarization using sequence-to-sequence RNNs and beyond. arXiv preprint arXiv:1602.06023 (2016)
14. Nenkova, A., McKeown, K.: Automatic summarization. Now Publishers Inc (2011)
15. Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li,
W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text
transformer. arXiv preprint arXiv:1910.10683 (2019)
16. Rudra, K., Goyal, P., Ganguly, N., Imran, M., Mitra, P.: Summarizing situational
tweets in crisis scenarios: An extractive-abstractive approach. IEEE Transactions
on Computational Social Systems 6(5), 981–993 (2019)
17. Yao, K., Zhang, L., Du, D., Luo, T., Tao, L., Wu, Y.: Dual encoding for abstractive text summarization. IEEE Transactions on Cybernetics 50(3), 985–996 (2018)