859 Submission
Introduction
One of the ways AI has been transforming assessment [1,2,3,13,18] is through the
use of neural network models for Automatic Question Generation (AQG). A
critical step in the AQG process, and one where neural network models still require
improvement, is the selection of the tail entity [14,15]: the named entity occurring
in the later part of the target question (Figure 1).
Tail entity selection is essential for exploiting the generative power of neural
network models, reducing the human effort and time needed to generate non-redundant
questions [18,19,20,21]. By carefully selecting the tail entity, tutors can align the
generated questions with specific learning objectives. Tail entity extraction must
therefore be integrated with advanced neural network models for the automatic
generation of effective questions (Tables 1 and 2).
Table 1 Tail Entity based AQG - Example

Subject | Question without Tail Entity | Tail Entity | Question with Tail Entity
DBMS | Explain deadlock avoidance in detail | deadlock avoidance | Describe the working of Banker's algorithm for deadlock avoidance
OS | What are processes and threads? | process and thread | Compare process and thread in operating system
DBMS | How to manage memory resources? | manage memory resources | Analyze the working of virtual memory to manage memory resources
OS | List down various process state transitions | process state transition | Analyze the role of short term scheduler in process state transition
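The conditioning pattern in Table 1 can be sketched programmatically. The prompt template below is a hypothetical illustration of how a selected tail entity could parameterise a T5-style input; it is not the exact input format used by the proposed model.

```python
def build_prompt(subject: str, tail_entity: str, source: str) -> str:
    """Compose a hypothetical T5-style input that conditions question
    generation on a selected tail entity (template is illustrative only)."""
    return f"generate question: subject: {subject} | entity: {tail_entity} | context: {source}"

prompt = build_prompt("OS", "process and thread", "What are processes and threads?")
```

With such a template, swapping the tail entity while keeping the same source sentence yields distinct, non-redundant prompts, which is the effect Table 1 illustrates.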
In this line, popular neural network based language models (architectures) such as
RNNs, GRUs, LSTMs and Transformers may require an effective representational
learning or fine-tuning process. A few contributions [3,4,5,10,13,17] have addressed
the fine-tuning of transformer based language models. However, improvement is also
required at the representational tier of the model, supported by the tail entity
selection process, where dependency parsing is particularly demanding [7,8,9,10].
Transition based parsing tools have been widely deployed to capture dependency
relationships from source text [9,10] (Figures 2 and 3), which helps in the appropriate
surface realization of target questions. At the same time, pruning noisy (inter-phrasal)
dependencies is essential for extracting tail entities of sufficient length. This involves
two major functions: (a) tuning the dependency parser over the syntactic parse tree of
the source sentence, and (b) pruning inter-phrasal dependency weights using the argmin
function of an optimizer. To select the entity most related to the context of the source
text, ranking of entities extracted from a knowledge graph [5,6] using lexical similarity
analysis is proposed. Based on these observations, the proposed work focuses on three
objectives:
1. Using Spider Monkey Optimizer (SMO) for tail entity extraction.
2. Ranking tail entities by relevance using a Composite Lexical Similarity Metric.
3. Fine-Tuning a T5 Transformer with generated entity as parameter for AQG.
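Function (b) above, pruning inter-phrasal dependency weights via argmin, can be illustrated with a minimal sketch. The arc weights and the token-to-phrase assignment below are hypothetical; in the proposed work this pruning is driven by the optimizer rather than a fixed round count.

```python
def argmin_inter_arc(arcs, phrase_of):
    """Return the lowest-weight arc crossing a phrase boundary, or None."""
    inter = [a for a in arcs if phrase_of[a[0]] != phrase_of[a[1]]]
    return min(inter, key=lambda a: a[2]) if inter else None

def prune(arcs, phrase_of, rounds=1):
    """Remove the argmin inter-phrasal arc `rounds` times; intra-phrasal
    arcs are never touched, so entity spans stay within phrases."""
    arcs = list(arcs)
    for _ in range(rounds):
        worst = argmin_inter_arc(arcs, phrase_of)
        if worst is None:
            break
        arcs.remove(worst)
    return arcs

# hypothetical dependency arcs: (head index, dependent index, weight)
arcs = [(0, 1, 0.9), (1, 2, 0.8), (1, 3, 0.6), (2, 3, 0.2), (3, 4, 0.7)]
phrase_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1}   # token index -> phrase id
pruned = prune(arcs, phrase_of, rounds=1)
```

Here the weakest boundary-crossing arc (2, 3, 0.2) is removed first, keeping the stronger inter-phrasal arc (1, 3, 0.6) and all intra-phrasal arcs intact.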
The rest of the paper is organized as follows. Section 2 reviews related work in
three major categories: 1. performance of transformer based language models,
2. knowledge graph based approaches, and 3. learning with dependency patterns
for Automatic Question Generation. Section 3 presents the research gaps and the
problem definition. Section 4 describes the methodology and the proposed solution.
Section 5 evaluates the performance of the proposed work on datasets under an
experimental setup. Section 6 provides concluding remarks and future directions.
Related Work
Various approaches to question generation [2,10,13,15,19,20,21] are widely adopted,
including template-based, rule-based, neural network-based, reinforcement learning-
based, and hybrid approaches. Among them, recent advancements such as the use of
pretrained language models to exploit their generative power and the adoption of
semantics-based lexical similarity analysis to improve representational ability
have been remarkable. In this line, baselines from the existing works are chosen for the
literature study.
A broad range of sequence-to-sequence (Seq2Seq) models, along with a feasible set of
supporting parameters and mechanisms, is noteworthy [10,13,17]. Variants of the
attention technique, with standard RNN architectures as well as Transformer models
[18,19,20,21], appear in a few works. It is evident from these studies that answer
awareness must be given much importance to generate effective target questions.
Prior studies on question generation from knowledge bases relied on existing labels and
dictionaries for the verbalization of entities and predicates [5,6]. The problem of
generating questions from structured knowledge bases, and the need to generalize to
unseen predicates and entity types, has been the focus of pattern constrained question
generation approaches [6,10]. Subgraph guided entity extraction methods have been
developed to generate questions using template back translation. Graph based
computational models have also been used to determine relevance in context [8,10].
However, ranking entities by relevance requires scoring with lexical or distributional
similarity metrics.
Extracting question related entities using dependency arcs has been a major focus of the
AQG task. Here, transition based parsing tools or libraries are adopted to extract
dependency relations from the source sentence [7,8,9]. However, the temporal
constraining mechanism only restricts projective dependencies, still allowing
inter-phrasal dependencies in context. This aggravates the problem of constraining the
length of the key phrase [7,10] to be extracted. Theoretically, entities should span
within phrases and rarely across phrases. This enforces the need to prune inter-phrasal
dependency relationships, which is challenging.
To empirically extract tail entities of constrained length, appropriate models are
necessary [2,15,20] that can prune inter-phrasal dependency relationships. To
address this requirement, the Spider Monkey Optimizer [11,12] is used to de-noise
dependency arcs (remove inter-phrasal dependencies), along with lexical similarity
based ranking, for fine-tuning the T5 model towards AQG.
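As a rough, self-contained illustration of how SMO explores a search space, the sketch below implements only a local-leader-style position update on a generic continuous objective. The full algorithm of [11,12] (global leader phase, group fission and fusion, perturbation rate) and the actual dependency-weight objective are omitted; the sphere objective is a stand-in.

```python
import random

def smo_minimize(objective, dim=2, swarm=20, iters=50, seed=0):
    """Simplified Spider-Monkey-style local search: each monkey moves
    toward the local leader (current best member) with a random
    perturbation drawn from a randomly chosen peer, and the move is
    kept only if it improves the objective (greedy selection)."""
    rng = random.Random(seed)
    monkeys = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm)]
    best = min(monkeys, key=objective)
    for _ in range(iters):
        leader = min(monkeys, key=objective)
        for i, sm in enumerate(monkeys):
            peer = monkeys[rng.randrange(swarm)]
            cand = [sm[d] + rng.random() * (leader[d] - sm[d])
                    + rng.uniform(-1, 1) * (peer[d] - sm[d])
                    for d in range(dim)]
            if objective(cand) < objective(sm):   # greedy acceptance
                monkeys[i] = cand
        best = min(best, min(monkeys, key=objective), key=objective)
    return best

sphere = lambda x: sum(v * v for v in x)   # toy objective, minimum at origin
best = smo_minimize(sphere)
```

In the proposed pipeline the objective would score candidate dependency-weight configurations, so the argmin drives inter-phrasal arcs toward removal rather than minimising a geometric function.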
Problem Definition
Let S be the source text and Q the target exam question to be generated. Let S1, S2,
S3, ..., St denote the t sentences of S, and Si1, Si2, Si3, ..., Sij the tokens of
sentence Si. For each Sij, dependency relations are extracted using transition based
parsing by setting the argmax inside an argmin learning objective, as shown in
Equations 1, 2 and 3.
Here, Dep() is the dependency based score calculation model and Pi-1, Pi, Pi+1 are
sequential phrase substructures of S. ∀Si in S, a set of dependency graphs D : D1,
D2, ..., Dx is generated over the temporal sequence of Si. The POS tag of each node
N ∈ D is acquired and filtered as F : D|N∈Pn. F then becomes the candidate set for
the rank generation process. ∀Fy ∈ F, a rank R is generated using the Information
Content (IC) score, where R increases as IC decreases and vice versa, i.e., R and IC
are inversely related. Based on R, tail entity selection is carried out as shown in
Equation 4.
Argmin[R(D1, D2), max] (4)
Here, max is the maximum number of noun phrases, which grows initially but undergoes
a gradual, exponential reduction as the number of statements in context increases. The
top ranked entity Er is set as a parameter to the T5 generative Transformer model.
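The inverse rank–IC relationship above can be sketched as follows. The entity frequencies are hypothetical, and IC is approximated here as negative log relative frequency, a common corpus-based estimate rather than the paper's exact formulation.

```python
import math

def rank_by_ic(freqs, total):
    """Rank candidate entities so that rank R rises as Information
    Content falls: IC(e) = -log(freq(e)/total). The lowest-IC entity
    receives the highest rank value, matching the inverse R-IC relation."""
    ic = {e: -math.log(f / total) for e, f in freqs.items()}
    ordered = sorted(ic, key=ic.get)              # ascending IC
    ranks = {e: len(ordered) - i for i, e in enumerate(ordered)}
    return ranks, ic

# hypothetical corpus frequencies for three candidate entities
ranks, ic = rank_by_ic({"process": 50, "memory": 30, "deadlock": 2}, total=100)
```

With these toy counts, "process" (common, low IC) ranks highest and "deadlock" (rare, high IC) ranks lowest, exactly the inverse relation stated for R and IC.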
Methodology
The proposed approach uses SMO [11,12,16] on dependency parsing for tail entity
extraction and a Composite Lexical Similarity Metric (CLSM) for tail entity ranking,
followed by fine-tuning the T5 model through effective parameterisation.
Notations | Explanation
S | Stack holding the head or tail of a dependency arc
A | Set of dependency arcs
B | List of tokens in the sentence
B' | Set of lists of tokens in each phrase of the sentence
B'' | Unique list in B'
TGT_S | Set of gold standard transitions
OP_S | Set of transitions made from B'' to attain B' = ∅
LD | Minimum edit distance between TGT_S and OP_S

Figure 4 Flowchart of Spider Monkey Optimizer
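The LD notation above, the minimum edit distance between the gold transition sequence TGT_S and the predicted sequence OP_S, can be computed with standard dynamic programming. The transition labels below are hypothetical examples.

```python
def edit_distance(tgt, op):
    """Minimum number of insertions, deletions and substitutions turning
    one transition sequence into the other (Levenshtein distance)."""
    m, n = len(tgt), len(op)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                      # delete all remaining tgt items
    for j in range(n + 1):
        dp[0][j] = j                      # insert all remaining op items
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if tgt[i - 1] == op[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[m][n]

ld = edit_distance(["SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC"],
                   ["SHIFT", "SHIFT", "RIGHT-ARC"])
```

A single missing LEFT-ARC transition gives LD = 1, so LD directly measures how far the parser's transition sequence strays from the gold standard.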
CLSM = \sum \frac{|e(N, S_{ij})|}{|e(S_{ij})|} \times \sum \frac{|e(N, S_{ij}, T_{ij})|}{|e(S_{ij}, T_{ij})|} \times \frac{1}{1 + d[e(S_{ij}, T_{ij}), LCS]}    (7)
Here, N, Sij and Tij correspond to a node (entity) in the KG, the tokens in the source
text and the tokens in the target question, respectively. e(x,y) counts the occurrences
of a particular sense x in source y, given a knowledge graph (KG). LCS is the Lowest
Common Subsumer of the entities for which similarity is computed, and d[x] gives the
depth or path length of the entity in the knowledge graph (provided both entities can
be traversed along the same path from the LCS).
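A toy computation of Equation 7 under stated assumptions: the overlap counts e(·) and the depth term d are supplied directly as hypothetical numbers rather than derived from an actual knowledge graph, and the two sums are collapsed to a single term each.

```python
def clsm(n_in_s, s_total, n_in_s_and_t, s_and_t_total, depth):
    """Composite Lexical Similarity Metric (Eq. 7), toy single-term form:
    source-overlap ratio x source/target-overlap ratio x depth decay.
    All counts are hypothetical stand-ins for KG sense occurrences."""
    return (n_in_s / s_total) * (n_in_s_and_t / s_and_t_total) \
        * (1.0 / (1.0 + depth))

# entity sense seen 3 times among 10 source senses, 2 times among 5
# shared source/target senses, at path length 1 from the LCS
score = clsm(n_in_s=3, s_total=10, n_in_s_and_t=2, s_and_t_total=5, depth=1)
```

The depth factor 1/(1+d) means an entity closer to the LCS (smaller d) scores higher, so shallower, more directly related entities are ranked above distant ones.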
Algorithm 1: FineTuneT5
Input: (Context, Source text) pairs (C,S), cognitive prompt P and parametric entity E
Output: Fine-tuned Transformer model T5' and generated question list Q
T = Load_model()                               // T can be T5-base or T5-small
optimizer = AdamW(T.parameters(), lr=3.0e-4)   // Initialize the optimizer
// Load the data in batches
data_loader = DataLoader(dataset=(C, S), batch_size=16, shuffle=True)
Experiments
Hyperparameter | Value
INPUT LENGTH | 512
OUTPUT LENGTH | 32
OPTIMIZER | AdamW
BATCH SIZE | 32
NUMBER OF EPOCHS | 10
LEARNING RATE | 3.0 × 10^-4
EARLY STOPPING | True
PATIENCE PARAMETER | 2
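The EARLY STOPPING and PATIENCE PARAMETER entries can be sketched as a generic training-loop guard. The per-epoch validation losses below are synthetic, not measurements from the experiments.

```python
def train_with_early_stopping(val_losses, patience=2):
    """Stop when validation loss fails to improve for `patience`
    consecutive epochs; returns (epochs actually run, best loss).
    `val_losses` is a synthetic per-epoch validation-loss sequence."""
    best, bad, epochs = float("inf"), 0, 0
    for loss in val_losses:                  # one entry per epoch
        epochs += 1
        if loss < best:
            best, bad = loss, 0              # improvement: reset counter
        else:
            bad += 1
            if bad >= patience:              # PATIENCE PARAMETER = 2
                break
    return epochs, best

epochs, best = train_with_early_stopping(
    [0.9, 0.7, 0.6, 0.65, 0.64, 0.5], patience=2)
```

With this synthetic trace, training halts after epoch 5: losses 0.65 and 0.64 both fail to beat the best value 0.6, exhausting the patience of 2 before the later improvement is ever seen.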
Fine-tuning the T5 transformer with the generated entity as a parameter allows the
generation of questions that are more specific to the learning objectives, as
confirmed by manual evaluation. Upon manual evaluation, Cohen's Kappa coefficient
for inter-rater reliability is 0.72, representing high agreement among raters for a
sample of 25 questions generated by the proposed T5-base+FOM+SMO model (Table 6).
In contrast, a sample of 25 questions generated by the proposed T5-small+FOM+SMO
model (Table 7) gives a kappa score of 0.55, denoting moderate inter-rater
reliability and indicating ambiguity or incompleteness in the generated questions.
Table 6 Human Evaluation - T5-base+FOM+SMO Model
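The Cohen's kappa values reported above can be reproduced from two raters' labels over the same items. The ratings below are hypothetical, not the paper's actual annotations.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters over the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(labels_a)
    cats = set(labels_a) | set(labels_b)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_e = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
              for c in cats)                 # chance agreement per category
    return (p_o - p_e) / (1 - p_e)

# hypothetical acceptability judgements from two raters on 6 questions
a = ["good", "good", "bad", "good", "bad", "good"]
b = ["good", "bad", "bad", "good", "bad", "good"]
kappa = cohens_kappa(a, b)
```

Here 5 of 6 judgements agree (p_o ≈ 0.83) against a chance agreement of 0.5, giving kappa ≈ 0.67; the paper's 0.72 and 0.55 arise the same way from its 25-question samples.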
In future, entity extraction and ranking need to be evaluated in other low
resource domains.
Acknowledgement
This research was carried out at E-Learning and HCI Lab, Department of Computer
Applications, National Institute of Technology, Tiruchirappalli.
References
1. Wei, X., Saab, N., & Admiraal, W. (2021). Assessment of cognitive, behavioral, and
affective learning outcomes in massive open online courses: A systematic literature
review. Computers & Education, 163, 104097.
https://doi.org/10.1016/j.compedu.2020.104097
2. Das, B., Majumder, M., Sekh, A. A., & Phadikar, S. (2022). Automatic question
generation and answer assessment for subjective examination. Cognitive Systems
Research, 72, 14-22. https://doi.org/10.1016/j.cogsys.2021.11.002
3. Chan, Y.-H. & Fan, Y.-C. (2019). A Recurrent BERT-based Model for Question
Generation. In Proceedings of the 2nd Workshop on Machine Reading for Question
Answering (pp. 154-162). Hong Kong, China: Association for Computational
Linguistics.
4. Kriangchaivech, K., & Wangperawong, A. (2019). Question Generation by
Transformers. ArXiv, abs/1909.05017.
5. Aigo, K., Tsunakawa, T., Nishida, M., & Nishimura, M. (2021). Question Generation
using Knowledge Graphs with the T5 Language Model and Masked Self-Attention. In
Proceedings of the IEEE Global Conference on Consumer Electronics (GCCE) (pp. 85-87).
https://doi.org/10.1109/GCCE53005.2021.9621874
6. Elsahar, H., Gravier, C., & Laforest, F. (2018). Zero-Shot Question Generation from
Knowledge Graphs for Unseen Predicates and Entity Types. In Proceedings of the 2018
Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, Volume 1 (Long Papers) (pp. 218-228).
New Orleans, Louisiana: Association for Computational Linguistics.
7. Li, Z., He, S., Zhang, Z., & Zhao, H. (2018). Joint Learning of POS and
Dependencies for Multilingual Universal Dependency Parsing. In Proceedings of the
CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
(pp. 65-73). https://doi.org/10.18653/v1/K18-2006
8. Doan, X. D., Tran, T. A., & Nguyen, L. M. (2020). Effective Approach to Joint
Training of POS Tagging and Dependency Parsing Models. In L. M. Nguyen, X. H.
Phan, K. Hasida, & S. Tojo (Eds.), Computational Linguistics: PACLING 2019 (Vol.
1215, pp. 423-434). Springer. https://doi.org/10.1007/978-981-15-6168-9_35
9. Jiang, S., Li, Z., Zhao, H., Lu, B.-L., & Wang, R. (2021). Tri-training for
Dependency Parsing Domain Adaptation. ACM Transactions on Asian and Low-
Resource Language Information Processing, 21(3), 48.
https://doi.org/10.1145/3488367
10. Li, B., Fan, Y., Sataer, Y., Gao, Z., & Gui, Y. (2022). Improving Semantic
Dependency Parsing with Higher-Order Information Encoded by Graph Neural
Networks. Applied Sciences, 12(8), 4089. https://doi.org/10.3390/app12084089
11. Sharma, H., Hazrati, G., & Bansal, J. C. (2019). Spider Monkey Optimization
Algorithm. In J. Bansal, P. Singh, & N. Pal (Eds.), Evolutionary and Swarm Intelligence
Algorithms (pp. 51-66). Springer. https://doi.org/10.1007/978-3-319-91341-4_4
12. Bansal, J. C., Sharma, H., Jadon, S. S., & Clerc, M. (2023, May 12). Spider
Monkey Optimization - A Nature Inspired Optimization Algorithm. https://smo.scrs.in/
13. Subramanian, S., Wang, T., Yuan, X., Zhang, S., Trischler, A., & Bengio, Y. (2018).
Neural Models for Key Phrase Extraction and Question Generation. In Proceedings of
the Workshop on Machine Reading for Question Answering, 78–88. Melbourne,
Australia: Association for Computational Linguistics.
14. Qu, F., Jia, X., & Wu, Y. (2021). Asking Questions Like Educational Experts:
Automatically Generating Question-Answer Pairs on Real-World Examination Data.
ArXiv, abs/2109.05179.
15. Willis, A., Davis, G., Ruan, S., Manoharan, L., Landay, J., & Brunskill, E. (2019).
Key Phrase Extraction for Generating Educational Question-Answer Pairs. In
Proceedings of the Sixth (2019) ACM Conference on Learning @ Scale (L@S '19) (pp.
1-10). Association for Computing Machinery.
https://doi.org/10.1145/3330430.3333636.
16. Sharma, A., Sharma, H., Bhargava, A., & Sharma, N. (2016). Power law-based local
search in spider monkey optimisation for lower order system modelling. International
Journal of Systems Science, 48, 1-11.
https://doi.org/10.1080/00207721.2016.1165895
17. Reddy, S., Raghu, D., Khapra, M. M., & Joshi, S. (2017). Generating natural
language question-answer pairs from a knowledge graph using a RNN based question
generation model. In Proceedings of the 15th Conference of the European Chapter of
the Association for Computational Linguistics: Volume 1, Long Papers (pp. 376-385).
18. Du, X., Shao, J., & Cardie, C. (2017). Learning to ask: Neural question generation
for reading comprehension. In Proceedings of the 55th Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long Papers), (pp. 1342–1352).
19. Dong, L., Yang, N., Wang, W., Wei, F., Liu, X., Wang, Y., Gao, J., Zhou, M., &
Hon, H.-W. (2019). Unified language model pre-training for natural language
understanding and generation. arXiv preprint arXiv:1905.03197.
20. Zhou, Q., Yang, N., Wei, F., Tan, C., Bao, H., & Zhou, M. (2017). Neural question
generation from text: A preliminary study. In National CCF Conference on Natural
Language Processing and Chinese Computing (pp. 662-671). Springer.
21. Du, X., & Cardie, C. (2018). Harvesting paragraph-level question-answer pairs
from Wikipedia. In Proceedings of the 56th Annual Meeting of the Association for
Computational Linguistics (Volume 1: Long Papers), 1907–1917.