Kcomt Lopez Wong2022

Framework for automating requirement elicitation using a chatbot

Luis Kcomt Lam, Faculty of Engineering, Universidad Peruana de Ciencias Aplicadas, Lima, Peru, 0000-0002-8832-8929
Cesar Andres Lopez Hurtado, Faculty of Engineering, Universidad Peruana de Ciencias Aplicadas, Lima, Peru, 0000-0001-9716-0155
Lenis Wong Portillo, Faculty of Engineering, Universidad Peruana de Ciencias Aplicadas, Lima, Peru, 0000-0002-5032-3233

Abstract: Requirement elicitation determines the success of a project given that it describes the needs of the context and the problem at hand. However, a recent Chaos Report showed that projects are increasingly failing in large numbers. This is generally due to the poor quality of requirements that are a result of an erroneous requirement elicitation phase (REP). This affects the scalability of the REP in complex environments. In this paper, we propose a framework to automate the REP, validated by a system that consumes a chatbot service from IBM Watson Assistant (IWA). The validation is done by comparing traditional requirement elicitation with requirement elicitation using our tool with the help of a Peruvian software company. Finally, we demonstrate that requirement elicitation with our implemented tool is more efficient than traditional requirement elicitation and results in higher quality requirements.

Keywords: User Stories; IBM Watson Assistant; Requirement elicitation; Chatbot

I. INTRODUCTION

Software requirements are characteristics of a system expressed by the stakeholders to delimit its functionality. These requirements are captured during a process called the requirement elicitation phase (REP) [1]-[3]. This process is traditionally carried out between users and analysts and determines the success of the later phases of a product's life cycle [4], [5]. However, this phase is often carried out erroneously. This can be seen in the 2019 CHAOS Report [6], where it was estimated that around 52.7% of all projects partially fail and 31.1% fail completely; that is, only about 16.2% of all projects are completely successful. The CHAOS Report considered these failures mainly a consequence of lack of resources, lack of planning, and poor-quality requirements [7].

Currently, there are some tools that help improve how analysts carry out the REP; nevertheless, most are oriented towards users with medium to high levels of experience [8]. Therefore, the REP cannot be applied in large or complex contexts, since doing so implies involving novice users in the REP, resulting in poor-quality requirements [2].

Regarding recent approaches in the field of chatbots and requirement engineering (RE), there are various methods for extracting functional requirements (FR) and non-functional requirements (NFR). However, few result in user stories (US) as such; most provide matrices, trees, or attributes, consequences, and core values (ACV) chains [10]. In addition, few studies used chatbots to capture requirements.

This study is motivated by the impact that the REP has on the success of a software project [4], and the difficulty it has when scaling [8], due to the lack of novice-oriented tools. Therefore, we propose a framework for automating the REP via the use of a chatbot provided by IWA. We validate our framework by implementing a system that offers automated US extraction via a structured interview between users and a chatbot, and other software management tools for correctly handling software requirements. We rely on IWA for the identification of intents and entities, and we implement an algorithm for the extraction of US based on the identified elements.

II. RELATED WORK

Regarding the related work, the methodology proposed by [2] was used to obtain and analyze several articles that were aligned with the objectives of our research. The methodology consists of 3 phases: planning, research, and results and analysis.

First, for the planning phase, 4 research questions were posed that served as objectives. RQ1: What aspects have been touched in the automation of RE? RQ2: What techniques have been applied in RE? RQ3: What natural language processing approaches have been applied to integrate chatbots with requirement elicitation? RQ4: What factors affect the REP?

Second, for the research phase, a search was done to find related studies that answered at least one of the research questions. The search was done manually, using the following keywords: "natural language processing", "requirement elicitation", "chatbot", "requirement engineering", "automation in requirement engineering". It was done through search engines such as Scopus, IEEE, Science Direct, etc. Mostly, only articles that had been published in the last 3 years and had a high level of impact (Q1, Q2) were chosen.

Third, for the results phase, 10 papers were chosen and analyzed based on the exclusion criteria mentioned previously. The analysis of the papers was done by classifying these in different taxonomies created based on the research questions posed (see Table I).

TABLE I. TAXONOMY OF COLLECTED PAPERS
Taxonomy | References
Aspects (RQ1) | [1], [5], [8], [13]
Techniques (RQ2) | [4], [5], [8], [9], [10], [13], [14]
Approaches (RQ3) | [1], [3], [5], [8]
Factors (RQ4) | [5], [11], [13]

Regarding the classification “aspects” touched in automation, the following results were obtained: automatic elicitation [1], cloud computing [13], automatic extraction [4] and automatic correction [14]. Automatic extraction mainly contained techniques that extracted the NFR based on documents or other FR, whereas automatic elicitation is based on capturing the requirements automatically in real time.
XXX-X-XXXX-XXXX-X/XX/$XX.00 ©20XX IEEE

Regarding the classification “techniques” used in RE, the following was identified: using chatbots [1], for NFR [3], for FR [2], based on templates [14] and based on machine learning [13]. Studies that applied chatbot techniques for RE did so by replacing the role of the customer [9] and by replacing the role of the analyst [1], [8].

For the integration “approaches” classification, the following sub-classifications were obtained: NLP [8]-[10] and structured interviews [13]. This last sub-classification aligns with the objective of our project as it allows novice analysts to extract high-quality requirements from novice clients, a quality that our chatbot must have if we wish to make the REP scalable.

Finally, regarding the classification of “factors” that impact the REP, the following results were obtained: activities of the REP [11] and impact factors [5]. For both sub-classifications, [2] sheds light on the fact that there are 7 activities that make up the REP. These are affected positively or negatively by certain factors such as constant staff, level of autonomy between analyst and client, etc. These activities must be carried out correctly so that high-quality software requirements may be captured.

III. PROPOSED FRAMEWORK

Our proposed framework serves to automate the REP. To validate it, a system (Reqbot) was implemented through 4 phases: chatbot configuration, communication design between chatbot and user, design, and implementation of the tool (see Fig. 1).

Fig. 1. Conceptualization of proposed framework: Reqbot

The system consisted of a web application composed of a backend, which manages the connection with IWA and the database, and a frontend. We used IWA to identify the entities and intents of the user’s message and to respond. The system provided tools for managing US and projects. Each user belonged to a project and each project had its respective US.

3.1 Communication Design Between Chatbot and User

This phase consisted of designing the communication between the user and the chatbot. For this, the flow of dialogue was considered, and how this would play into the extraction of US. This phase was carried out in 2 steps.

3.1.1 Flow Design of Dialog Nodes

In IWA, dialog nodes are used to execute possible conversations that the chatbot can have with the user based on a combination of intents and entities. At the beginning of the conversation, the chatbot asked the user for his name. Each dialog node had 2 branches based on whether the user had given his name or not; each branch had a minimum of 3 possible dialogue options.

3.1.2 Structured Interview Design

The structured interview was implemented based on the laddering technique, an interview approach to find the subconscious motives of the user by asking similar questions repeatedly [15]. Therefore, when the user wanted to write a requirement, the chatbot asked about the attributes, consequences, and values of the system or product in mind. In a real context, the interviewer (a human) would ask "why?" (consequence) as many times as he deemed necessary, since it allows him to find better and more important reasons. However, because the chatbot cannot differentiate between "mature" and "immature" consequences, it asked "why?" a random number of times, between 2 and 4, to help the user identify his underlying reasons. Table II gives the pseudo-code of this communication between chatbot and user.

TABLE II. COMMUNICATION BETWEEN CHATBOT AND USER
Communication between chatbot and user
Chatbot greets the user (based on time of day) and asks for name
User responds
If chatbot identifies a name, then
    The name is used in following messages
Else
    Subsequent messages are not personalized with name
User indicates that he wants to start structured interview
Chatbot confirms and asks about a functionality that the product should have
User responds
If chatbot identifies an attribute, then
    The user is asked about the consequence of the functionality n times (2-4)
    User responds
    If chatbot identifies a consequence, then
        If the chatbot has identified a role during the interview
            End interview and create US
        Else
            Ask for which role this functionality belongs
            User responds
            End interview and create US
    Else
        Ask again about consequence
Else
    Ask again about functionality

3.2 Chatbot Configuration

In this phase, the virtual assistant was configured using its graphical interface to properly converse with the user. The configuration of the chatbot was carried out through 3 steps: defining intents, defining entities and configuration of dialogue nodes.

3.2.1 Defining Intents

Intents are tags that describe the implicit intent of the user's message. For our case, 13 intents were defined (see Table III).

TABLE III. 8 OF 13 INTENTS DEFINED IN IWA
Intents | Description | Examples
#greetings | User is greeting | Hello!
#information_how | The user wants to know how to submit a requirement | How do I submit a requirement?
#information_who | The user wants to know about the chatbot | Who are you?
#writePartOfRequirement | Intent used to label contextual entities. | As a user, I want to log in with ID to be able to access the system faster.
#sys-info-analysts | The user wants to learn about the analysts | Who are the analysts?
#sys-info-US | The user wants to learn about the US | What happens after I submit a US?
#sys-info-project | The user wants to learn about projects | How do projects work?
TABLE III (CONTINUED)
Intents | Description | Examples
#startInterview | The user wants to start the structured interview (submit requirement) | I want to write a US

Every time an intent or entity was defined, the chatbot was automatically trained. At least 5 examples were given per intent for a robust training phase.

3.2.2 Defining Entities

Entities are data structures that identify relevant information from user messages. Since we sought to extract requirements in the form of US, the attribute (functionality), consequence (result) and role entities were defined. The name entity was used to personalize the dialog as mentioned previously (see Table IV).

TABLE IV. ENTITIES DEFINED IN IWA
Entities | Description | Examples
@attribute | US “functionality” part | Two factor authentication
@concequence | US “result” part | For more security
@name | Username to use in dialogue with him. | Bruno
@role | US “role” part | Administrative user

We used contextual entities to define @role, @attribute and @name, so that the chatbot could identify these entities based on their context and not through their character composition. For this, we used the examples of the intent #writePartOfRequirement to label its entities. In Fig. 2, each example of the intent is tagged with a blue label, indicating to IWA that it is an entity.

3.2.3 Dialog Node Configuration

This activity consisted of configuring the dialogue nodes with conditionals, intents, and entities. For example, if the chatbot detected the intent #startInterview, it would go to the structured interview dialogue branch (see Fig. 2). 24 dialogue nodes were defined to cover some combinations of the 14 intents and 4 entities.

Fig. 2. Node Dialog Flow of Structured Interview with IBM Watson

3.3 Tool Design and Implementation

In this phase, the web application was implemented. There are 3 types of users within the system: the owner of the organization, who creates projects and assigns clients or analysts to these projects; the client, who writes the US for the projects; and the analyst, who refines the submitted US. After the US are refined, all users in the organization can export them as a PDF. Every US is automatically versioned within the system. This phase is made up of 2 activities: backend development and frontend development. The backend was developed using Node.js and deployed on a Heroku server. The backend manages the connection with the IWA API and the connection with a MySQL database. The frontend, in turn, was developed using React with Bootstrap UI elements and was also deployed on a Heroku server.

IV. VALIDATION

4.1 Methodology

To validate our proposed framework, a web application (Reqbot) was implemented. The validation was based on the comparison of two scenarios (see Table V). In one, the REP was done in a traditional way; in the other, it was done using the proposed tool, Reqbot. A software factory company in Peru was considered for the study sample.

Each scenario was carried out by 4 employees of the company. The employees were divided into pairs. Within each pair, one employee took the role of the client and the other that of the analyst. For the first scenario, the REP was carried out in a traditional way, manually in an Excel document. For the second scenario, the REP was done using Reqbot. The client used the web platform to converse with the chatbot. The chatbot internally captured the requirements and extracted the US. Then, the analyst refined the US with our tool. Finally, the client exported the refined US as a PDF file for each project. For each scenario, the groups had the task of capturing requirements based on 3 project descriptions provided by us. One REP lasting 10 minutes was carried out per project. Table V details the relationship between time, projects, and scenarios.

TABLE V. SCENARIOS FOR VALIDATION
# | Scenarios | # Projects | Time per project
1 | Traditional requirement elicitation | 3 | 10 minutes
2 | Requirement elicitation using Reqbot | 3 | 10 minutes

As mentioned previously, the projects were provided through 3 descriptions. The first project consisted of a mobile application to find parking in Lima. The second consisted of a mobile application to provide restaurant delivery information, and the third of a static web page to display a clothing catalog that is based on the season.

4.2 Measurements

Efficiency, taken from ISO 9126 [16], was used to compare the performance of the 2 scenarios (see Table VI). It was calculated for each project by dividing the number of US produced by the duration of the REP. The efficiency of each scenario was found based on the average efficiency of all the projects. Also, based on a study by [13], completeness, correctness, and verifiability were used to determine the quality of the US obtained using the Reqbot system (see Table VI). A survey [17] was used to calculate these measurements.

TABLE VI. MEASUREMENTS USED
Measurements | Description | Types | Formula
Efficiency | Number of user stories captured divided by the duration of the capture. | Quantitative | # US / Duration of elicitation
Completeness | User story has all the necessary and required information by the stakeholders. | Qualitative | Survey average
TABLE VI (CONTINUED)
Measurements | Description | Types | Formula
Correctness | User story complies with what is requested by the stakeholders. | Qualitative | Survey average
Verifiability | You can verify the functionality of the user story when implemented. | Qualitative | Survey average

The survey was designed using a 5-point Likert scale (0 = Strongly disagree, 1 = Disagree, 2 = Neither agree nor disagree, 3 = Agree, 4 = Strongly agree). The survey consisted of 5 questions, each one relating to at least one of the qualities (see Table VII).

TABLE VII. MEASUREMENTS USED
# | Questions | Qualities
1 | Do you think that the user stories did not lack any information? | Completeness
2 | Do you consider that the user stories have enough information to be implemented? | Completeness
3 | Do you consider the requirements correct? (Without obvious or logical errors) | Correctness
4 | Do you consider that the requirements comply with what was requested by the client? | Correctness
5 | Do you consider that the requirements are verifiable? | Verifiability

V. RESULTS

This section shows the results obtained in the validation. Table VIII compares the results between scenarios 1 and 2, based on the quantitative measurement of efficiency and the qualitative measurements of completeness, correctness, and verifiability.

TABLE VIII. JOINT RESULTS OF BOTH SCENARIOS
Measurements | Scenario 1 | Scenario 2
Efficiency (requirements per minute) | 1.20 | 1.32
Mean score of completeness | 0.75 | 3.38
Mean score of correctness | 2.13 | 3.25
Mean score of verifiability | 2.25 | 3.50

The difference in efficiency between the traditional REP and the REP using the Reqbot system is around 0.12 US per minute. In other words, within a 30-minute capture Reqbot would have an advantage of 3.6 requirements captured over the traditional REP. Also, US captured using the Reqbot system obtained on average 2.63 more points in completeness, 1.12 more points in correctness and 1.25 more points in verifiability.

VI. CONCLUSIONS AND FUTURE WORK

In this study, a framework was proposed to automate the REP, validated through a web app that consumed an IWA chatbot cloud service: Reqbot. This chatbot was configured by defining a series of intents, contextual entities, and dialogue nodes. The chatbot was designed to support the extraction of US based on the laddering technique.

To corroborate the proposal, an experiment was implemented to compare 2 scenarios, one by doing the REP traditionally and the other using the proposed tool Reqbot. The experiment was carried out with 4 employees from a software factory company in Peru, based on quantitative and qualitative measurements. It was shown that Reqbot performed better than the traditional REP, capturing 0.12 more US per minute (1.32 versus 1.20, about 10% more) and capturing higher quality US based on completeness, correctness, and verifiability.

For future work, it is proposed to use natural language processing in the refinement stage to improve the quality of already captured requirements by working on certain qualities such as ambiguity and traceability.

REFERENCES
[1] Dwitama, F., & Rusli, A. (2020). User stories collection via interactive chatbot to support requirements gathering. TELKOMNIKA (Telecommunication Computing Electronics and Control), 18(2), 890. https://doi.org/10.12928/telkomnika.v18i2.14866
[2] Wong, L., Mauricio, D., & Rodriguez, G. (2017). A systematic literature review about software requirements elicitation. Journal of Engineering Science and Technology, 12(2), 296-317. Available on: http://jestec.taylors.edu.my/V12Issue2.htm
[3] Shreda, Q., & Hanani, A. (2021). Identifying Non-functional Requirements from Unconstrained Documents using Natural Language Processing and Machine Learning Approaches. IEEE Access. Published. https://doi.org/10.1109/ACCESS.2021.3052921
[4] Raharjana, I. K., Siahaan, D., & Fatichah, C. (2019). User Story Extraction from Online News for Software Requirements Elicitation: A Conceptual Model. 2019 16th International Joint Conference on Computer Science and Software Engineering (JCSSE). Published. https://doi.org/10.1109/jcsse.2019.8864199
[5] Bano, M., Zowghi, D., Ferrari, A., Spoletini, P., & Donati, B. (2018). Learning from Mistakes: An Empirical Study of Elicitation Interviews Performed by Novices. 2018 IEEE 26th International Requirements Engineering Conference (RE). Published. https://doi.org/10.1109/re.2018.00027
[6] The Standish Group International Incorporated (2019). The chaos manifesto: Think big, act small.
[7] OpenDoor. (2019, February 20). The Standish Group report 83.9% of IT projects partially or completely fail. Open Door. https://www.opendoorerp.com/the-standish-group-report-83-9-of-it-projects-partially-or-completely-fail/
[8] Rietz, T., & Maedche, A. (2019). LadderBot: A Requirements Self-Elicitation System. 2019 IEEE 27th International Requirements Engineering Conference (RE). Published. https://doi.org/10.1109/re.2019.00045
[9] Laiq, M., & Dieste, O. (2020). Chatbot-based Interview Simulator: A Feasible Approach to Train Novice Requirements Engineers. 2020 10th International Workshop on Requirements Engineering Education and Training (REET). Published. https://doi.org/10.1109/reet51203.2020.00007
[10] Gunes, T., & Aydemir, F. B. (2020). Automated Goal Model Extraction from User Stories Using NLP. 2020 IEEE 28th International Requirements Engineering Conference (RE). Published. https://doi.org/10.1109/re48521.2020.00052
[11] Wong, L., & Mauricio, D. S. (2019). Qualities That The Activities Of The Elicitation Process Must Meet To Obtain A Good Requirement. Journal of Engineering Science and Technology, 14(5), 2883–2912. Available on: http://jestec.taylors.edu.my/V14Issue5.htm
[12] IBM. (2021, April 22). IBM Cloud Docs. Cloud.Ibm.Com/Docs. https://cloud.ibm.com/docs/assistant?topic=assistant-dev-process
[13] Younas, M., Jawawi, D. N. A., Ghani, I., & Shah, M. A. (2019). Extraction of non-functional requirement using semantic similarity distance. Neural Computing and Applications, 32(11), 7383–7397. https://doi.org/10.1007/s00521-019-04226-5
[14] Kamalrudin, M., Mustafa, N., & Sidek, S. (2018). A Template for Writing Security Requirements. Communications in Computer and Information Science, 73–86. https://doi.org/10.1007/978-981-10-7796-8_6
[15] T., Enders, J., Hawley, M., Deeley, C., C., S., Cover, K., N., Keeler, R., & Rahman, S. U. (2009). Laddering: A Research Interview Technique for Uncovering Core Values. UXmatters. https://www.uxmatters.com/mt/archives/2009/07/laddering-a-research-interview-technique-for-uncovering-core-values.php
[16] ISO 9126 - INFORMATICA. (n.d.). Informaticamcprats. https://sites.google.com/site/informaticamcprats/iso-9126
[17] Google Forms. (2021). Encuesta para determinar la calidad de los requisitos captados. Retrieved October 29, 2021, from https://docs.google.com/forms/d/e/1FAIpQL
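
As an illustration of the structured interview in Section 3.1.2 (Table II), the following is a minimal Python sketch of the laddering loop and of the US it produces. It is not the Reqbot implementation: the `recognize` callable is a hypothetical stand-in for the entity recognition that IWA performs, `toy_recognize` is a keyword-based placeholder used only to make the sketch runnable, and the US wording, keeping the last consequence as the result, and using the raw reply when no role is recognized are all assumptions. Only the user's replies are modeled; the chatbot's questions are implied by the order in which replies are consumed.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, Optional

# Hypothetical stand-in for IWA entity recognition: maps a user message to
# the entities found in it, e.g. {"attribute": "log in with ID"}.
Recognizer = Callable[[str], Dict[str, str]]


@dataclass
class UserStory:
    role: str
    attribute: str    # US "functionality" part
    consequence: str  # US "result" part

    def text(self) -> str:
        # Assumed wording; the paper does not prescribe an exact template.
        return f"As a {self.role}, I want to {self.attribute} so that {self.consequence}."


def toy_recognize(message: str) -> Dict[str, str]:
    """Keyword-based placeholder recognizer (not how IWA works)."""
    lowered = message.lower()
    if lowered.startswith("i want to "):
        return {"attribute": message[len("I want to "):].rstrip(".")}
    if lowered.startswith("because "):
        return {"consequence": message[len("because "):].rstrip(".")}
    if lowered.startswith("role: "):
        return {"role": message[len("role: "):].strip()}
    return {}


def run_interview(replies: Iterator[str], recognize: Recognizer,
                  whys: Optional[int] = None) -> UserStory:
    """Consume user replies following the flow of Table II and build a US."""
    role: Optional[str] = None

    # Ask about a functionality until an attribute is identified.
    attribute: Optional[str] = None
    while attribute is None:
        found = recognize(next(replies))
        attribute = found.get("attribute")
        role = found.get("role", role)

    # Ask "why?" a random number of times, between 2 and 4 (inclusive).
    if whys is None:
        whys = random.randint(2, 4)
    consequence = ""
    answered = 0
    while answered < whys:
        found = recognize(next(replies))
        role = found.get("role", role)
        if "consequence" in found:
            # Assumption: the last consequence given is kept as the result.
            consequence = found["consequence"]
            answered += 1
        # Otherwise the consequence question is simply asked again.

    # Ask for the role only if none was identified during the interview.
    if role is None:
        reply = next(replies)
        role = recognize(reply).get("role", reply.strip())

    return UserStory(role, attribute, consequence)
```

Passing `whys` explicitly makes a run reproducible; leaving it as `None` reproduces the random 2 to 4 repetitions described in Section 3.1.2. If the scripted replies run out before the interview completes, `next(replies)` raises `StopIteration`, which a real dialogue would never encounter because the chatbot waits for the user.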
