859 Submission
Introduction
One of the ways AI has been transforming assessment [1,2,3,13,18] is through the
use of neural network models for Automatic Question Generation (AQG). A
critical step in the AQG process, and one where neural network models still require
improvement, is the selection of the tail entity [14,15]: the named entity occurring
in the later part of the target question (Figure 1).
Tail entity selection is essential for exploiting the generative power of neural
network models, reducing the human effort and time needed to generate non-redundant
questions [18,19,20,21]. By carefully selecting the tail entity, tutors can align the
generated questions with specific learning objectives. Tail entity extraction must
therefore be integrated with advanced neural network models for the automatic
generation of effective questions (Tables 1 and 2).
Table 1 Tail Entity based AQG - Example

Subject | Question without Tail Entity | Tail Entity | Question with Tail Entity
DBMS | Explain deadlock avoidance in detail | deadlock avoidance | Describe the working of Banker's algorithm for deadlock avoidance
OS | What are processes and threads? | process and thread | Compare process and thread in operating system
DBMS | How to manage memory resources? | manage memory resources | Analyze the working of virtual memory to manage memory resources
OS | List down various process state transitions | process state transition | Analyze the role of short term scheduler in process state transition
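The conditioning pattern in Table 1 can be sketched programmatically. The prompt template below is a hypothetical illustration of how a selected tail entity could parameterise a T5-style input; it is not the exact input format used by the proposed model.

```python
def build_prompt(subject: str, tail_entity: str, source: str) -> str:
    """Compose a hypothetical T5-style input that conditions question
    generation on a selected tail entity (template is illustrative only)."""
    return f"generate question: subject: {subject} | entity: {tail_entity} | context: {source}"

prompt = build_prompt("OS", "process and thread", "What are processes and threads?")
```

With such a template, swapping the tail entity while keeping the same source sentence yields distinct, non-redundant prompts, which is the effect Table 1 illustrates.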
In this line, popular neural network based language models (architectures) such as
RNNs, GRUs, LSTMs and Transformers may require an effective representational
learning or fine-tuning process. A few contributions [3,4,5,10,13,17] have addressed
the fine-tuning of transformer based language models. However, improvement is also
required at the representational tier of the model, supported by the tail entity
selection process, where dependency parsing is particularly demanding [7,8,9,10].
Transition based parsing tools have been widely deployed to capture dependency
relationships from source text [9,10] (Figures 2 and 3), which helps in the appropriate
surface realization of target questions. At the same time, pruning noisy (inter-phrasal)
dependencies is essential for extracting tail entities of sufficient length. This involves
two major functions: (a) tuning the dependency parser over the syntactic parse tree of
the source sentence, and (b) pruning inter-phrasal dependency weights using the argmin
function of an optimizer. To select the entity most related to the context of the source
text, ranking of entities extracted from a knowledge graph [5,6] using lexical similarity
analysis is proposed. Based on these observations, the proposed work focuses on three
objectives:
1. Using Spider Monkey Optimizer (SMO) for tail entity extraction.
2. Ranking tail entities by relevance using a Composite Lexical Similarity Metric.
3. Fine-Tuning a T5 Transformer with generated entity as parameter for AQG.
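Function (b) above, pruning inter-phrasal dependency weights via argmin, can be illustrated with a minimal sketch. The arc weights and the token-to-phrase assignment below are hypothetical; in the proposed work this pruning is driven by the optimizer rather than a fixed round count.

```python
def argmin_inter_arc(arcs, phrase_of):
    """Return the lowest-weight arc crossing a phrase boundary, or None."""
    inter = [a for a in arcs if phrase_of[a[0]] != phrase_of[a[1]]]
    return min(inter, key=lambda a: a[2]) if inter else None

def prune(arcs, phrase_of, rounds=1):
    """Remove the argmin inter-phrasal arc `rounds` times; intra-phrasal
    arcs are never touched, so entity spans stay within phrases."""
    arcs = list(arcs)
    for _ in range(rounds):
        worst = argmin_inter_arc(arcs, phrase_of)
        if worst is None:
            break
        arcs.remove(worst)
    return arcs

# hypothetical dependency arcs: (head index, dependent index, weight)
arcs = [(0, 1, 0.9), (1, 2, 0.8), (1, 3, 0.6), (2, 3, 0.2), (3, 4, 0.7)]
phrase_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1}   # token index -> phrase id
pruned = prune(arcs, phrase_of, rounds=1)
```

Here the weakest boundary-crossing arc (2, 3, 0.2) is removed first, keeping the stronger inter-phrasal arc (1, 3, 0.6) and all intra-phrasal arcs intact.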
The rest of the paper is organized as follows. Section 2 reviews related work in
three major categories: 1. performance of transformer based language models,
2. knowledge graph based approaches, and 3. learning with dependency patterns
for Automatic Question Generation. Section 3 presents the research gaps and the
problem definition. Section 4 describes the methodology and the proposed solution.
Section 5 evaluates the performance of the proposed work on datasets under an
experimental setup. Section 6 provides concluding remarks and future directions.
Related Work
Various approaches to question generation [2,10,13,15,19,20,21] are widely adopted,
including template-based, rule-based, neural network-based, reinforcement learning-
based, and hybrid approaches. Among them, recent advancements such as the use of
pretrained language models to exploit their generative power and the adoption of
semantics-based lexical similarity analysis to improve representational ability
have been remarkable. In this line, baselines from the existing works are chosen for the
literature study.
A broad range of sequence-to-sequence (Seq2Seq) models, along with a feasible set of
supporting parameters and mechanisms, is noteworthy [10,13,17]. Variants of the
attention technique, with standard RNN architectures as well as Transformer models
[18,19,20,21], appear in a few works. It is evident from these studies that answer
awareness must be given much importance to generate effective target questions.
Prior studies on question generation from knowledge bases relied on existing labels and
dictionaries for the verbalization of entities and predicates [5,6]. The problem of
generating questions from structured knowledge bases, and the need to generalize to
unseen predicates and entity types, has been the focus of pattern constrained question
generation approaches [6,10]. Subgraph guided entity extraction methods have been
developed to generate questions using template back translation. Graph based
computational models have also been used to determine relevance in context [8,10].
However, ranking entities by relevance requires scoring with lexical or distributional
similarity metrics.
Extracting question related entities using dependency arcs has been a major focus of the
AQG task. Here, transition based parsing tools or libraries are adopted to extract
dependency relations from the source sentence [7,8,9]. However, the temporal
constraining mechanism only restricts projective dependencies, still allowing
inter-phrasal dependencies in context. This aggravates the problem of constraining the
length of the key phrase [7,10] to be extracted. Theoretically, entities should span
within phrases and rarely across phrases. This enforces the need to prune inter-phrasal
dependency relationships, which is challenging.
To empirically extract tail entities of constrained length, appropriate models are
necessary [2,15,20] that can prune inter-phrasal dependency relationships. To
address this requirement, the Spider Monkey Optimizer [11,12] is used to de-noise
dependency arcs (remove inter-phrasal dependencies), along with lexical similarity
based ranking, for fine-tuning the T5 model towards AQG.
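As a rough, self-contained illustration of how SMO explores a search space, the sketch below implements only a local-leader-style position update on a generic continuous objective. The full algorithm of [11,12] (global leader phase, group fission and fusion, perturbation rate) and the actual dependency-weight objective are omitted; the sphere objective is a stand-in.

```python
import random

def smo_minimize(objective, dim=2, swarm=20, iters=50, seed=0):
    """Simplified Spider-Monkey-style local search: each monkey moves
    toward the local leader (current best member) with a random
    perturbation drawn from a randomly chosen peer, and the move is
    kept only if it improves the objective (greedy selection)."""
    rng = random.Random(seed)
    monkeys = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm)]
    best = min(monkeys, key=objective)
    for _ in range(iters):
        leader = min(monkeys, key=objective)
        for i, sm in enumerate(monkeys):
            peer = monkeys[rng.randrange(swarm)]
            cand = [sm[d] + rng.random() * (leader[d] - sm[d])
                    + rng.uniform(-1, 1) * (peer[d] - sm[d])
                    for d in range(dim)]
            if objective(cand) < objective(sm):   # greedy acceptance
                monkeys[i] = cand
        best = min(best, min(monkeys, key=objective), key=objective)
    return best

sphere = lambda x: sum(v * v for v in x)   # toy objective, minimum at origin
best = smo_minimize(sphere)
```

In the proposed pipeline the objective would score candidate dependency-weight configurations, so the argmin drives inter-phrasal arcs toward removal rather than minimising a geometric function.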
Problem Definition
Let S be the source text and Q the target exam question to be generated. Let S1, S2,
S3, ..., St denote the t sentences of S, and Si1, Si2, Si3, ..., Sij the tokens of
sentence Si. For each Sij, dependency relations are extracted using transition based
parsing by setting the argmax inside an argmin learning objective, as shown in
Equations 1, 2 and 3.
Here, Dep() is the dependency based score calculation model and Pi-1, Pi, Pi+1 are
sequential phrase substructures of S. ∀Si in S, a set of dependency graphs D : D1,
D2, ..., Dx is generated over the temporal sequence of Si. The POS tag of each node
N ∈ D is acquired and filtered as F : D|N∈Pn. F then becomes the candidate set for
the rank generation process. ∀Fy ∈ F, a rank R is generated using the Information
Content (IC) score, where R increases as IC decreases and vice versa, i.e., R and IC
are inversely related. Based on R, tail entity selection is carried out as shown in
Equation 4.
Argmin[R(D1, D2), max] (4)
Here, max is the maximum number of noun phrases, which grows initially but undergoes
a gradual, exponential reduction as the number of statements in context increases. The
top ranked entity Er is set as a parameter to the T5 generative Transformer model.
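The inverse rank–IC relationship above can be sketched as follows. The entity frequencies are hypothetical, and IC is approximated here as negative log relative frequency, a common corpus-based estimate rather than the paper's exact formulation.

```python
import math

def rank_by_ic(freqs, total):
    """Rank candidate entities so that rank R rises as Information
    Content falls: IC(e) = -log(freq(e)/total). The lowest-IC entity
    receives the highest rank value, matching the inverse R-IC relation."""
    ic = {e: -math.log(f / total) for e, f in freqs.items()}
    ordered = sorted(ic, key=ic.get)              # ascending IC
    ranks = {e: len(ordered) - i for i, e in enumerate(ordered)}
    return ranks, ic

# hypothetical corpus frequencies for three candidate entities
ranks, ic = rank_by_ic({"process": 50, "memory": 30, "deadlock": 2}, total=100)
```

With these toy counts, "process" (common, low IC) ranks highest and "deadlock" (rare, high IC) ranks lowest, exactly the inverse relation stated for R and IC.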
Methodology
The proposed approach uses SMO [11,12,16] on dependency parsing for tail entity
extraction and a Composite Lexical Similarity Metric (CLSM) for tail entity ranking,
followed by fine-tuning the T5 model through effective parameterisation.
Notations | Explanation
S | Stack holding the head or tail of a dependency arc
A | Set of dependency arcs
B | List of tokens in the sentence
B' | Set of lists of tokens in each phrase of the sentence
B'' | Unique list in B'
TGT_S | Set of gold standard transitions
OP_S | Set of transitions made from B'' to attain B' = ∅
LD | Minimum edit distance between TGT_S and OP_S

Figure 4 Flowchart of Spider Monkey Optimizer
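The LD notation above, the minimum edit distance between the gold transition sequence TGT_S and the predicted sequence OP_S, can be computed with standard dynamic programming. The transition labels below are hypothetical examples.

```python
def edit_distance(tgt, op):
    """Minimum number of insertions, deletions and substitutions turning
    one transition sequence into the other (Levenshtein distance)."""
    m, n = len(tgt), len(op)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                      # delete all remaining tgt items
    for j in range(n + 1):
        dp[0][j] = j                      # insert all remaining op items
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if tgt[i - 1] == op[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[m][n]

ld = edit_distance(["SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC"],
                   ["SHIFT", "SHIFT", "RIGHT-ARC"])
```

A single missing LEFT-ARC transition gives LD = 1, so LD directly measures how far the parser's transition sequence strays from the gold standard.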
CLSM = \sum \frac{|e(N, S_{ij})|}{|e(S_{ij})|} \times \sum \frac{|e(N, S_{ij}, T_{ij})|}{|e(S_{ij}, T_{ij})|} \times \frac{1}{1 + d[e(S_{ij}, T_{ij}), LCS]}    (7)
Here, N, Sij and Tij correspond to a node (entity) in the KG, the tokens in the source
text and the tokens in the target question, respectively. e(x,y) counts the occurrences
of a particular sense x in source y, given a knowledge graph (KG). LCS is the Lowest
Common Subsumer of the entities for which similarity is computed, and d[x] gives the
depth or path length of the entity in the knowledge graph (provided both entities can
be traversed along the same path from the LCS).
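A toy computation of Equation 7 under stated assumptions: the overlap counts e(·) and the depth term d are supplied directly as hypothetical numbers rather than derived from an actual knowledge graph, and the two sums are collapsed to a single term each.

```python
def clsm(n_in_s, s_total, n_in_s_and_t, s_and_t_total, depth):
    """Composite Lexical Similarity Metric (Eq. 7), toy single-term form:
    source-overlap ratio x source/target-overlap ratio x depth decay.
    All counts are hypothetical stand-ins for KG sense occurrences."""
    return (n_in_s / s_total) * (n_in_s_and_t / s_and_t_total) \
        * (1.0 / (1.0 + depth))

# entity sense seen 3 times among 10 source senses, 2 times among 5
# shared source/target senses, at path length 1 from the LCS
score = clsm(n_in_s=3, s_total=10, n_in_s_and_t=2, s_and_t_total=5, depth=1)
```

The depth factor 1/(1+d) means an entity closer to the LCS (smaller d) scores higher, so shallower, more directly related entities are ranked above distant ones.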
Algorithm 1: FineTuneT5
Input: (Context, Source text) pairs (C,S), cognitive prompt P and parametric entity E
Output: Fine-tuned Transformer model T5' and generated question list Q
T = Load_model()                               // T can be T5-base or T5-small
optimizer = AdamW(T.parameters(), lr=3.0e-4)   // Initialize the optimizer
// Load the data in batches
data_loader = DataLoader(dataset=(C, S), batch_size=16, shuffle=True)
Experiments
Hyperparameter | Value
INPUT LENGTH | 512
OUTPUT LENGTH | 32
OPTIMIZER | AdamW
BATCH SIZE | 32
NUMBER OF EPOCHS | 10
LEARNING RATE | 3.0 × 10^-4
EARLY STOPPING | True
PATIENCE PARAMETER | 2
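The EARLY STOPPING and PATIENCE PARAMETER entries can be sketched as a generic training-loop guard. The per-epoch validation losses below are synthetic, not measurements from the experiments.

```python
def train_with_early_stopping(val_losses, patience=2):
    """Stop when validation loss fails to improve for `patience`
    consecutive epochs; returns (epochs actually run, best loss).
    `val_losses` is a synthetic per-epoch validation-loss sequence."""
    best, bad, epochs = float("inf"), 0, 0
    for loss in val_losses:                  # one entry per epoch
        epochs += 1
        if loss < best:
            best, bad = loss, 0              # improvement: reset counter
        else:
            bad += 1
            if bad >= patience:              # PATIENCE PARAMETER = 2
                break
    return epochs, best

epochs, best = train_with_early_stopping(
    [0.9, 0.7, 0.6, 0.65, 0.64, 0.5], patience=2)
```

With this synthetic trace, training halts after epoch 5: losses 0.65 and 0.64 both fail to beat the best value 0.6, exhausting the patience of 2 before the later improvement is ever seen.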
Fine-tuning the T5 transformer with the generated entity as a parameter allows the
generation of questions that are more specific to the learning objectives, as
confirmed by manual evaluation. Upon manual evaluation, Cohen's Kappa coefficient
for inter-rater reliability is 0.72, representing high agreement among raters for a
sample of 25 questions generated by the proposed T5-base+FOM+SMO model (Table 6).
In contrast, a sample of 25 questions generated by the proposed T5-small+FOM+SMO
model (Table 7) gives a kappa score of 0.55, denoting moderate inter-rater
reliability and indicating ambiguity or incompleteness in the generated questions.
Table 6 Human Evaluation - T5-base+FOM+SMO Model
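The Cohen's kappa values reported above can be reproduced from two raters' labels over the same items. The ratings below are hypothetical, not the paper's actual annotations.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters over the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(labels_a)
    cats = set(labels_a) | set(labels_b)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_e = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
              for c in cats)                 # chance agreement per category
    return (p_o - p_e) / (1 - p_e)

# hypothetical acceptability judgements from two raters on 6 questions
a = ["good", "good", "bad", "good", "bad", "good"]
b = ["good", "bad", "bad", "good", "bad", "good"]
kappa = cohens_kappa(a, b)
```

Here 5 of 6 judgements agree (p_o ≈ 0.83) against a chance agreement of 0.5, giving kappa ≈ 0.67; the paper's 0.72 and 0.55 arise the same way from its 25-question samples.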
In future, entity extraction and ranking need to be evaluated in other low
resource domains.
Acknowledgement
This research was carried out at E-Learning and HCI Lab, Department of Computer
Applications, National Institute of Technology, Tiruchirappalli.
References
1. Wei, X., Saab, N., & Admiraal, W. (2021). Assessment of cognitive, behavioral, and
affective learning outcomes in massive open online courses: A systematic literature
review. Computers & Education, 163, 104097.
https://doi.org/10.1016/j.compedu.2020.104097
2. Das, B., Majumder, M., Sekh, A. A., & Phadikar, S. (2022). Automatic question
generation and answer assessment for subjective examination. Cognitive Systems
Research, 72, 14-22. https://doi.org/10.1016/j.cogsys.2021.11.002
3. Chan, Y.-H. & Fan, Y.-C. (2019). A Recurrent BERT-based Model for Question
Generation. In Proceedings of the 2nd Workshop on Machine Reading for Question
Answering (pp. 154-162). Hong Kong, China: Association for Computational
Linguistics.
4. Kriangchaivech, K., & Wangperawong, A. (2019). Question Generation by
Transformers. ArXiv, abs/1909.05017.
5. Aigo, K., Tsunakawa, T., Nishida, M., & Nishimura, M. (2021). Question Generation
using Knowledge Graphs with the T5 Language Model and Masked Self-Attention. In
Proceedings of the IEEE Global Conference on Consumer Electronics (GCCE) (pp. 85-87).
https://doi.org/10.1109/GCCE53005.2021.9621874
6. Elsahar, H., Gravier, C., & Laforest, F. (2018). Zero-Shot Question Generation from
Knowledge Graphs for Unseen Predicates and Entity Types. In Proceedings of the 2018
Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, Volume 1 (Long Papers) (pp. 218-228).
New Orleans, Louisiana: Association for Computational Linguistics.
7. Li, Z., He, S., Zhang, Z., & Zhao, H. (2018). Joint Learning of POS and
Dependencies for Multilingual Universal Dependency Parsing. In Proceedings of the
CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
(pp. 65-73). https://doi.org/10.18653/v1/K18-2006
8. Doan, X. D., Tran, T. A., & Nguyen, L. M. (2020). Effective Approach to Joint
Training of POS Tagging and Dependency Parsing Models. In L. M. Nguyen, X. H.
Phan, K. Hasida, & S. Tojo (Eds.), Computational Linguistics: PACLING 2019 (Vol.
1215, pp. 423-434). Springer. https://doi.org/10.1007/978-981-15-6168-9_35
9. Jiang, S., Li, Z., Zhao, H., Lu, B.-L., & Wang, R. (2021). Tri-training for
Dependency Parsing Domain Adaptation. ACM Transactions on Asian and Low-
Resource Language Information Processing, 21(3), 48.
https://doi.org/10.1145/3488367
10. Li, B., Fan, Y., Sataer, Y., Gao, Z., & Gui, Y. (2022). Improving Semantic
Dependency Parsing with Higher-Order Information Encoded by Graph Neural
Networks. Applied Sciences, 12(8), 4089. https://doi.org/10.3390/app12084089
11. Sharma, H., Hazrati, G., & Bansal, J. C. (2019). Spider Monkey Optimization
Algorithm. In J. Bansal, P. Singh, & N. Pal (Eds.), Evolutionary and Swarm Intelligence
Algorithms (pp. 51-66). Springer. https://doi.org/10.1007/978-3-319-91341-4_4
12. Bansal, J. C., Sharma, H., Jadon, S. S., & Clerc, M. (2023, May 12). Spider
Monkey Optimization - A Nature Inspired Optimization Algorithm. https://smo.scrs.in/
13. Subramanian, S., Wang, T., Yuan, X., Zhang, S., Trischler, A., & Bengio, Y. (2018).
Neural Models for Key Phrase Extraction and Question Generation. In Proceedings of
the Workshop on Machine Reading for Question Answering, 78–88. Melbourne,
Australia: Association for Computational Linguistics.
14. Qu, F., Jia, X., & Wu, Y. (2021). Asking Questions Like Educational Experts:
Automatically Generating Question-Answer Pairs on Real-World Examination Data.
ArXiv, abs/2109.05179.
15. Willis, A., Davis, G., Ruan, S., Manoharan, L., Landay, J., & Brunskill, E. (2019).
Key Phrase Extraction for Generating Educational Question-Answer Pairs. In
Proceedings of the Sixth (2019) ACM Conference on Learning @ Scale (L@S '19) (pp.
1-10). Association for Computing Machinery.
https://doi.org/10.1145/3330430.3333636.
16. Sharma, A., Sharma, H., Bhargava, A., & Sharma, N. (2016). Power law-based local
search in spider monkey optimisation for lower order system modelling. International
Journal of Systems Science, 48, 1-11.
https://doi.org/10.1080/00207721.2016.1165895
17. Reddy, S., Raghu, D., Khapra, M. M., & Joshi, S. (2017). Generating natural
language question-answer pairs from a knowledge graph using a RNN based question
generation model. In Proceedings of the 15th Conference of the European Chapter of
the Association for Computational Linguistics: Volume 1, Long Papers (pp. 376-385).
18. Du, X., Shao, J., & Cardie, C. (2017). Learning to ask: Neural question generation
for reading comprehension. In Proceedings of the 55th Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long Papers), (pp. 1342–1352).
19. Dong, L., Yang, N., Wang, W., Wei, F., Liu, X., Wang, Y., Gao, J., Zhou, M., &
Hon, H.-W. (2019). Unified language model pre-training for natural language
understanding and generation. arXiv preprint arXiv:1905.03197.
20. Zhou, Q., Yang, N., Wei, F., Tan, C., Bao, H., & Zhou, M. (2017). Neural question
generation from text: A preliminary study. In National CCF Conference on Natural
Language Processing and Chinese Computing (pp. 662-671). Springer.
21. Du, X., & Cardie, C. (2018). Harvesting paragraph-level question-answer pairs
from Wikipedia. In Proceedings of the 56th Annual Meeting of the Association for
Computational Linguistics (Volume 1: Long Papers), 1907–1917.