Download as pdf or txt
Download as pdf or txt
You are on page 1of 9

International Journal of Mechanical Engineering and Robotics Research Vol. 9, No.

4, April 2020

An Overview of Machine Learning in Chatbots


Prissadang Suta1
1
School of Information Technology, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
Email: prissadang.sut@mail.kmutt.ac.th

Xi Lan2, Biting Wu2


2
Division of Engineering Science, University of Toronto, Ontario, Canada
Email: {xi.lan, mary.wu}@ mail.utoronto.ca

Pornchai Mongkolnam1 and Jonathan H. Chan1


1
School of Information Technology, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
Email: {pornchai, jonathan}@sit.kmutt.ac.th

Abstract—A chatbot is an intelligent system which can hold [4]. Chatbots now exist in various messaging platforms,
a conversation with a human using natural language in real such as Facebook Messenger, Skype, and Kik, largely for
time. Due to the rise of Internet usage, many businesses now customer service purposes [6].
use online platforms to handle customer inquiries, and Chatbots also evolved to interact via voice as well.
many of them turn to chatbots for improving their customer
Such chatbots are typically known as virtual assistants. In
service or for streamlining operations and increasing their
productivity. However, there is still a gap between existing
particular, the use of NLP led to the Big Four Voice
chatbots and the autonomous, conversational agents Assistants: Apple’s Siri (released as a standalone app in
businesses hope to implement. As such, this paper will first 2010, bundled into iOS in 2011, and added to the
provide an overview of chatbots and then focus on research HomePod device in December 2017), Microsoft’s
trends regarding the development of human-like chatbots Cortana (2013), Amazon’s Alexa (released with its Echo
capable of closing this technological gap. We reviewed the products in 2014), and Google’s Assistant (announced in
literature published over the past decade, from 1998 to 2018, 2016 [4], and has been an extension of Google’s Voice
and presented an overview of chatbots using a mind-map. Search product called Google Now since 2011) [7]. They
The research findings suggest that chatbots operate in three are embedded in smartphones and smart home devices to
steps: understanding the natural language input; generating
an automatic, relevant response; and, constructing realistic
control the Internet-of-Things (IoT) enabled devices.
and fluent natural language responses. The current These assistants are now using voice recognition powered
bottleneck in designing artificially intelligent chatbots lies in by AI to learn the words and phrases of the user’s voice
the industry’s lack of natural language processing in order to interact with users in a personalized manner.
capabilities. Without the ability to properly understand the For example, Audrey was the first documented speech
content and context of a user’s input, the chatbot cannot recognition system in 1952, which recognized digits
generate a relevant response. spoken by a single voice. Since then, Siri has improved to
recognize users’ voices and respond with personality.
Index Terms—chatbots, conversational agents, dialog system, Although there are still improvements to be made, voice
human computer interaction
recognition technology is becoming increasingly used in
business and commerce which can hear and understand
I. INTRODUCTION
what you are saying even in noisy environments [8]–[10].
A chatbot, also known as a conversational agent, is a In fact, these conversational interfaces were deemed one
computer software capable of taking a natural language of the key breakthrough technologies of 2016 [4].
input and providing a conversational output in real time As evident, chatbots have become quite popular over
[1]. This human-chatbot interaction is typically carried the years. This is likely due to the rise of Internet users
out through a graphical user interface based on human- worldwide – there were 3.15 billion users in 2015, 3.39
computer interaction (HCI) principles [2], [3]. billion in 2016, and 3.58 billion in 2017 [11]. There has
The idea of an intelligent machine engaging in human also been a rise in e-commerce, as shown in Fig. 1,
interactions was first theorized by Alan Turing in 1950 coupled with an increased demand for customer service
[4], [5]. Shortly after, automated computer programs, on digital platforms [12]. According to Harvard Business
referred to as “bots”, were created to simulate human Review, a mere five-minute delay could decrease a
conversation. For example, ELIZA in 1966 matched user business’s chances of selling to a customer. In fact, a ten-
prompts to scripted responses, and Artificial Linguistic minute delay could reduce their chances by 400% [13].
Internet Computer Entity (ALICE) in 1995 introduced However, a study done by Xu et al. examined one million
natural language processing (NLP) to interpret user input conversations and found that the average response time
was 6.5 hours [12]. To ameliorate this situation, some
Manuscript received August 3, 2019; revised March 1, 2020.

© 2020 Int. J. Mech. Eng. Rob. Res 502


doi: 10.18178/ijmerr.9.4.502-510
International Journal of Mechanical Engineering and Robotics Research Vol. 9, No. 4, April 2020

businesses began to employ chatbots to handle inquiries This paper is organized as follows: methodology,
24/7. Admittedly, there is still a gap between existing mind-map presenting an overview of chatbots, detailed
chatbots and chatbots intelligent enough to replace human breakdown of the different aspects of chatbots, and
representatives, but it is highly likely that chatbots will conclusions.
play a significant role in the digital future [4].
II. METHODOLOGY
This paper will review works containing the listed
keywords over the past two decades, published from 1998
to 2018, in online databases, such as Google Scholar,
IEEE Xplore, ACM, and Web of Science. After searching
for the keywords in the online databases, the two criteria
used for selecting review papers were the publication date
and the number of citations. It was important to review
papers that were recent, especially since technology
changes very rapidly over the years. Older papers were
also reviewed to understand the history and the
Figure 1. Retail e-commerce sales worldwide from 2014 to 2021 (in progression of chatbots over the years. After conducting
billion USD) [14]. the literature review, a mind-map was created to
summarize the findings, as seen in Fig. 2. The mind-
In addition to the chatbot’s increasing popularity in mapping method was used to visualize the relations
business, there have been various publications providing between different concepts from various research papers.
an overview of existing chatbot platforms, architectures, This method was chosen due to its capability of
and chatbot implementation methods [14]–[16]. One presenting complex relationships in a simple, visual form
perspective [14] mentioned two challenges in the field of [17]. The branches stemming from the center represent
chatbot development: 1) Chatbots can only recognize the main characteristics, current architectures and systems,
specific sentence structures; 2) The responses generated machine learning approaches, and applications of
by existing machine learning techniques are not always chatbots. Many of the published papers discussed
accurate or personalized. In this paper, we focus on architectures and systems, but few papers delved into the
summarizing existing machine learning techniques using machine learning approaches in the context of chatbot
the mind-mapping method. development, which is the focus of this paper.

Figure 2. An overview of the properties of chatbots.

conversations. This showed that advanced chatbots are


III. CHATBOT CHARACTERISTICS expected to not only answer questions but also learn and
Understanding the key characteristics of a chatbot is improve themselves with each conversation, and
important for designing a chatbot. These characteristics eventually be able to respond appropriately in various
were established through the study of people’s contexts [12]. To create a smart chatbot capable of doing
expectations for chatbots [18], [19], and the method used so, the main characteristics and capabilities needed are
in the study was comparing past human-human and listed as follows:
human-chatbot conversations. As mentioned in a chatbot A. Communicate Using Natural Language
survey [16], the development of chatbots went from
Chatbots should understand and respond using human
pattern matching and simple “Q&A” style to a more
natural language. Advanced automated chatbots
human-like way of carrying out and continuing

© 2020 Int. J. Mech. Eng. Rob. Res 503


International Journal of Mechanical Engineering and Robotics Research Vol. 9, No. 4, April 2020

nowadays generally use Machine Learning (ML), coupled stating their problems. Xu et al. [12] collected
with Natural Language Processing (NLP) within the conversations between humans and chatbots over social
domain of Artificial Intelligence (AI) [20]. We found that media and reported that approximately 40% of those
the anatomy of chatbots [21], used for seeking conversations were emotional. As a result, their chatbot,
information, generally has three components: Natural which used deep learning and information retrieval (IR)
Language Understanding (NLU) to categorize the user’s techniques, learned informal writing styles used in
intent, Dialogue Management (DM) to determine user’s emotional conversations and can show as much empathy
intent, and Natural Language Generation (NLG) to as human agents when chatting with the users. On a
generate a response in natural language. similar note, Chang et al. found that the word2vec
1) Understanding Natural Language Input. technique could help the machine understand emotional
NLP and ML are used to tackle the two most common texts from users [29].
problems in computational linguistics (a branch of Immediate: Realistic and Fast Response. Ideally, if the
linguistics in which computer science techniques are used user’s question requires searching through the internet,
for analyzing language). The first problem is sentiment the chatbot should search and respond as quickly as most
analysis. It aims to first identify sentiments from the search engines.
information input, which can be documents, sentences, or
B. Security
phrases, and then classify these sentiments into three
polarity groups: positive, negative, or neutral. This Security features should be applied to all the data in
polarity classification can be done using NLP approaches, the chatbot databases for user security and privacy. This
such as n-grams, adjectives labeled, dependency relations is particularly important in the field of personal-life care.
and objective terms [22]. The other problem is linguistic For example, sensitive personal information like medical
similarity. It aims to represent linguistics using different and health information should be encrypted to ensure
lexicalization levels such as words, stem, named entities, confidentiality [30]. Moreover, security features in
and so on [23]. There are three main categories to chatbots should only grant system or database access to
determine useful distinctions in the study of language: verified intended users. For example, chatbots in the form
pragmatic, the purpose of context; semantic, the meaning of smart home devices should allow only the legal and
of context; and syntax, being grammatically correct. registered residents in a house to command and control
2) Dialogue Management the devices [7], [31].
After understanding the user input, the DM stage
should choose an interaction strategy to determine a IV. CHATBOT SYSTEMS
response based on the context of the conversation [24]. In The system required to design, develop and implement
other words, the dialogue management stage classifies the a chatbot which can interpret the user’s intent and provide
question type and determines the relevant category of proper responses to questions, is studied and broken
answers the chatbot can use for responding. Many down in the following section:
research papers presented ML approaches to accomplish
A. Communication Medium
this task. For example, Support Vector Machines (SVM)
is one of the trending techniques for Q&A problems This section details current chatbot communication
[25]–[27]. platforms. Common media which allow users to input
3) Responding with Natural Language messages and communicate with a chatbot include
After determining the category of relevant answers, the messaging platforms, such as applications and cloud-
chatbot must then: 1) construct the relevant, personalized based services (Slack bot, IBM Watson), and physical
response using natural language, and 2) respond with no devices (Amazon Alexa, Google Home).
time delay. 1) Messaging Platforms and their Features
Personalized: Writing Styles and Emotions. The The messaging platform is for communicating via
responses generated by the chatbots should be online chat in real time. Many messaging platforms serve
grammatically correct and exhibit human-like behaviors different purposes with diverse needs, but they all operate
and emotions. Chatbots should converse in certain based on text, and their main purpose is dealing with
manners based on the user’s psychological state and customer service. R. Khan and A. Das [1] compared the
behavior, and they should learn to adapt their writing available features across various messaging platforms
style to the user’s need and the context of the (Facebook, Skype, Slack, Telegram, Microsoft Teams
conversation. For example, Zhang et al. [28] proposed and Viber), as seen in Table I. These features include text
personalized response generators which use different message, carousel, cards, button, quick reply, webview,
responding styles. Their generators use lexical principles group chatbot, list, audio, video, GIF, image, and
and a sequence to sequence framework with a recurrent document or file. This is to help developers choose which
neural network to detect the user’s behaviour. In platform to use for deploying their chatbots based on their
customer service, chatbots that can detect user’s emotions specific needs.
and expected reactions are heavily needed since their
users generally express their emotions in lieu of rationally

© 2020 Int. J. Mech. Eng. Rob. Res 504


International Journal of Mechanical Engineering and Robotics Research Vol. 9, No. 4, April 2020

TABLE I. MESSAGING PLATFORMS FEATURES COMPARISON [1]. level is Text I/O; the third level is many domain-specific
Features /
chatbots, each of them has a specific task such as
Tele- MS
Facebook Skype Slack Viber location-based weather reports, reminders, news, jokes,
Platforms gram Teams
and others; the fourth level consists of third party services
Text message to deploy the chatbot. The chatbot can respond with text
Carousel Partial and/or audio depending on the request.
Button

Quick reply

Web view

Group chatbot

List

Audio

Video

GIF

Image

Document/file
Figure 3. Chatbot architecture [38].

2) Smart Home Devices Summarizing these papers in [32], [38], [39], we see
In addition to commercial chatbots, which are used for that a chatbot development system can be broken down to
customer service purposes, there are also chatbots that act front-end and back-end environments. These are to
as personal virtual assistants. These assistants help users address the main tasks of chatbots proposed in the
accomplish daily tasks, such as booking a hotel, getting Characteristics section and the mind-map: understanding
the latest news, checking the weather forecasts, or even the user’s natural language input, generating a proper
buying products based on the user’s personal preferences response addressing the user’s needs, and constructing
[30], [32]–[37]. the response with natural language. The front-end system
In order for these virtual assistant chatbots to have is the chatbot’s “mouth” and “ears/eyes”, seeing what the
more functionality, more communication channels were user inputs are and responding. The back-end system is
enabled, e.g. users can use voices to control many types the chatbot’s “brain,” where algorithms and models are
of devices [31]. Current virtual assistants require a used to understand user intents and determine a suitable
physical device in the environment to act as a host for the response.
trained chatbots in the cloud. For example, Amazon Echo Front-End System. The front-end consists of the user
operates on Alexa, Google Home uses Google Assistant, interface and the communication medium. As mentioned
Apple HomePod uses Siri, and Microsoft’s virtual above, the input and output can consist of textual and
assistance with Cortana platform, and so on. A drawback audio data. Chatbots accepting audio inputs use a speech
with current virtual assistants is that they still have a recognition engine that can convert speech-to-text (STT)
closed knowledge domain, as they are only expected to and text-to-speech (TTS).
carry out certain specified and preset tasks, such as Back-End System. The back-end system is hosted on
turning on/off the lights, projectors, and air conditioning, the cloud platform. It processes inputs and generates
but a conversation topic aside from that might not be relevant responses. It is composed of a server, an AI
recognized by the assistant. In the future, a more engine, custom APIs, and a database through a REST API
improved chatbot should be able to converse openly [32].
about a variety of topics [15]. ML techniques which can help a chatbot better
understand the context of conversation or construct
B. System Architecture
suitable responses are in need. As of now, there are many
The chatbot system is now leaning towards cloud- NLP tools which can be used, such as Dialogflow from
based, open-source, and serverless data operating systems. Google, formerly known as Api.ai
Some examples are OpenWhisk from IBM [38] and AWS (https://developers.google.com/actions/dialogflow/project
Lambda from Amazon [32]. They provide an easy way to -agent). More recently, to better model user behavior,
build a chatbot using a variety of programming languages, many companies have developed AI platforms and
such as Javascript, Java, Python, etc. which enhances the released their APIs to the public. A few examples are:
functionality of backend system and thus delivers a IBM Watson (https://www.ibm.com/watson/developer/),
personalized experience to each user. To build a closed- LUIS.ai, Microsoft’s Language Understanding Intelligent
domain, serverless chatbot, Mengting et. al. [38] Service (https://www.luis.ai/), Wit.ai (https://wit.ai/), and
presented a generic architecture as shown in Fig. 3. It Amazon Lex (https://aws.amazon.com/lex/developers/).
consists of four levels: the first level is Audio I/O which
converts audio input to text and vice versa; the second

© 2020 Int. J. Mech. Eng. Rob. Res 505


International Journal of Mechanical Engineering and Robotics Research Vol. 9, No. 4, April 2020

C. Knowledge Base TABLE II. EXAMPLE OF AIML CODE [43].

The knowledge base is the information the chatbot 1 <category>


refers to during the dialogue management stage when 2 <pattern>What the user says</pattern>
generating a response. The knowledge base is the “brain” 3 <template>What the bot responds</template>
of the chatbot engine, generally containing 4 </category>
keywords/phrases and the responses associated with these
keywords/phrases [40]. The <pattern> tag is used to match in user’s input. The
Data Resources. For chatbots to obtain knowledge, <template> tag is used to respond to the pattern.
they need to extract data from different large-scale
information sources and store this information to a B. Rule-Based
knowledge base. These information sources could either The rule-based or template-based approach is used to
be structured, semi-structured, or unstructured [40]. Most map sentences with the pattern associated with the
existing chatbots currently retrieve information from collected input database. Some chatbots applying these
structured documents to build their knowledge base, since techniques are: ELIZA (created by Weizenbaum in 1966)
structured documents have labelled utterance-response and PARRY (created by Colby in 1975) chatbot. For
(or Q-R) pairs the chatbot engine can store [41]. There ELIZA, the textual input searches for the keywords,
are also new information retrieval approaches, such as a which are then assigned a rank. The input is transformed
method called DocChat, proposed for extracting into a “keystack,” where keywords with the highest rank
information from unstructured documents. Instead of the are at the top. The keyword of the highest rank is used to
Q-R paris, the DocChat method selects the most relevant determine the category of responses most related to the
sentence from the document directly and thus improves keyword [43]. This helps the bot determine a relevant
the fluency in chatbots’ responses. These retrieval response. PARRY has a similar structure to ELIZA but
approaches for unstructured data would aid the chatbot in uses a better controlling structure and has the ability to
developing a larger knowledge base. understand language, since it has a mental model that can
Knowledge Domain. The knowledge base can either be simulate the bot’s emotions [45].
open-domain or closed-domain (restricted domain).
Chatbots with an open-domain knowledge base are VI. MACHINE LEARNING APPROACHES FOR CHATBOT
generally conversational agents, capable of responding to DEVELOPMENT
a variety of user inputs. On the other hand, chatbots with This section discusses existing modern NLP
a closed-domain knowledge base focus on specific techniques that can be used when developing a chatbot.
domains, such as law, medicine, and programming. Recently, NLP techniques have been combined with ML,
Closed-domain chatbots are generally goal-oriented -- the as ML improves the chatbots’ performance of finding
user is likely attempting to accomplish a task, such as patterns from large amounts of data. The following
asking a question, setting an alarm, or making a sections discuss the current stages of NLP and ML
reservation. For a closed-domain question-answering techniques applied in the field of chatbot model
chatbot, it is also important to consider the size of data development.
available, the domain of interest itself, and the resources
available for the domain when determining what A. Data Preprocessing
techniques to use [42]. An arbitrary user input in human language needs to be
processed to be “clean data,” which is the data a chatbot
V. EARLY APPROACHES FOR CHATBOT DEVELOPMENT machine can understand. There are many preprocessing
This section discusses some of the earlier approaches methods currently used in chatbots, such as stopwords
used for chatbot development which do not use ML. This removal, removing capitalization, and labelling. There are
includes pattern matching and rule-based methods. three features to consider when preprocessing the data:
Lexical. The lexical feature is also called the “word
A. Pattern Matching form” feature because it focuses on each word rather than
This approach is commonly used in question- the sentence structure or grammar [46]. It uses the bag of
answering bots. They generate predefined outputs and n-grams method to group words together [47]. Three
match them with a given input according to the preprocessing methods using this lexical approach are
characteristic variables of sentences. An example of a word level n-grams, stemming, and lemmazation [44].
question answering bot is A.L.I.C.E. (Wallace, 2009), The word level n-grams method groups together n
which stands for Artificial Linguistic Internet Computer consecutive words and looks for word n-grams which
Entity [43], [44]. The following sections introduce a might indicate the category of the question (for example,
common technique used for pattern matching. if the unigram “city” is used, the question is likely asking
Artificial Intelligence Markup Language (AIML) is a for a location). Stemming reduces words to their
form of eXtensible Markup Language (XML), aimed to grammatical roots by removing suffixes [47]. This,
define rules for matching pattern and thus find the proper however, fails when words change endings in plural form
responses. An example is shown in Table II: (for example, leaf and leaves would have different stems).
Lemmazation is a more accurate approach which can

© 2020 Int. J. Mech. Eng. Rob. Res 506


International Journal of Mechanical Engineering and Robotics Research Vol. 9, No. 4, April 2020

identify the correct roots by referring to a lexical database word level, phrase level, sentence level, document level,
of English [48]. relation level, type level, and topic level. This allowed
Syntactic. Methods using syntactic principles include them to respond to utterances in addition to question-
part-of-speech (POS) tagging and chunking. POS tagging response (Q-R) pairs in which the response R is a short
[46] labels each word in a sentence with its part of speech text and only depends on the last user utterance Q. Their
(noun, pronoun, verb, etc.). Chunking is then used to method selects a sentence from given documents directly,
partition the sentence into non-overlapping, non-recursive by ranking all possible sentences based on features
segments. Each partition has a chunk tag, which is its designed at different levels of granularity. They
class label. The question classifier model then uses the compared their chatbot’s performance with a chitchat
POS tags, the surrounding context, and the class label to engine from China called XiaoIce, and found that their
identify the question type [49]. chatbot generated more formal and informative responses,
Semantic. Whereas syntax focuses on sentence whereas XiaoIce generated more colloquial responses.
structure, semantics focuses on the meaning of words. They also found that their chatbot generated either an
One method using semantic principles is Named Entity equally relevant or more relevant response than XiaoIce
Recognition (NER) [50]. Most NER implementations use in 109 out of 156 conversations, which is promising [41].
a coarse-grained hierarchical classifier consisting of a
C. Generation-Based
layered semantic hierarchy of answer types [51]. One
paper designed an open-domain question answering The generation-based methods use an encoder-decoder
chatbot using a two-layered hierarchy containing 6 coarse framework [41]. Some researchers proposed deep
classes and 50 fine classes to answer 500 questions in the learning technique using the sequence to sequence
TREC competition. They classified user questions into (seq2seq) model to predict the next sentence in a
different question types (with 98.8% accuracy), generated conversation given previous sentences using two
expected answer types, extracted keywords, and recurrent neural network (RNN), one being an encoder
reformulated questions into semantically equivalent and the other being the decoder [54]. The encoder output
questions [51]. provides necessary information to the decoder to generate
Vector Representations. Vector representation maps a sequence element by element. The decoder takes a
high dimensional word features to low dimensional sequence as the input and generates a sequence output
feature vectors. Based on certain rules and relationships, [43].
words are represented by vector coordinates. For example, Long Short-Term Memory (LSTM) networks are an
related words are closer together. The common extension of RNN which maximize the probability of
techniques for vector representation are: generating a response given the previous conversations
 word2vec [29]: Each word is represented by a [55]. A. Xu et al. [2] proposed a technique using LSTM
vector in a specified vector space containing networks to generate responses. Their process starts by
continuous bag-of-word (CBOW) and skip-gram converting the user’s input into its vector representations
(SG) architectures. with the word2vec method, and then using LTSM to learn
 doc2vec [52]: The vector representation for the mapping from sequence to sequence. It consists of
paragraphs and documents is found by taking the two LSTM neural networks. The first one is an encoder
weighted average of all the words in the document. which maps variable-length inputs to a fixed-length
 Global Vectors (GloVe) [6]: The global corpus vector. The second LSTM neural network is a decoder
statistics for the unsupervised learning of word
representations which outperform other models on
which then maps this vector to a variable-length output.
word analogy, word similarity, and named entity This is similar to the two RNN networks in the seq2seq
recognition (NER). The source code from Stanford model.
can be found at
http://nlp.stanford.edu/projects/glove/ VII. CHATBOT IN INDUSTRIES
B. Retrieval-Based In various industries, chatbots are becoming a
A group of researchers used the information retrieval ubiquitous component of customer service. The usages of
technique to tackle one of the difficult problems for chatbots in different fields are summarized in Table III
chatbots: the short text conversation. By collecting short below. They are used in Customer Relationship
conversations on social media and using them to train Management (CRM) which helps companies stay
different models, such as the translation model, latent connected to both current and potential customers for
space model (linear model), deep learning model (non- increased customer retention [56]. Both commercial and
linear model), and topic-word model, they reported that non-profit companies can improve their profitability if
the retrieval-based model can perform more they understand their users’ needs better.
“intelligently” than some of the older approaches. This
TABLE III. THE ROLE OF CHATBOTS IN VARIOUS INDUSTRIES.
model collects data from many social media sources, such
as Q&A forums, and find the differences between user Industries Description
inputs and questions online with the cosine similarities Healthcare Personalized medical assistant relies on AI
method [53]. algorithms to hold daily conversations, provide
Another research group used unstructured documents health-related information, and recommend
activities and restaurants to the elderly [33]. As
and examined their features on different levels, such as

© 2020 Int. J. Mech. Eng. Rob. Res 507


International Journal of Mechanical Engineering and Robotics Research Vol. 9, No. 4, April 2020

Industries Description user’s input, which would help the chatbot generate a
purposed by this paper [34], an LSTM model can be more accurate, relevant response.
used to extract semantic information from the However, existing chatbots have a few limitations. The
elderly’s inputs. The chatbot’s responses were main challenge for a chatbot right now is understanding
generated by Euclidean distance for matching the context in a conversation and generating a relevant
patterns. These chatbots often use frameworks
which have four layers. Data layer: record the data response. Hence, future intelligent chatbots should: 1)
processing progress and store the labeled data implement improved natural language processing
collected from multiple sensory components. techniques to accurately recognize the content of the user
Information layer: mapping on lifelong ontology, input; 2) learn to understand the context of conversations
Knowledge layer: personalized behavioral
predictions, and Service layer: the results of health and respond accordingly with emotions or personalized
service recommendation for cloud computing content. The ultimate goal of chatbots is to replicate
environments [30]. human-human interaction, which requires improved
Travel These chatbots can recommend travel plans based machine learning and natural language processing
on personal preferences from travel history that was
gathered from previous flight, hotel, and car rental techniques.
bookings. It then generates a recommendation using The current trend in chatbot development suggests that
collaborative filtering with rating scores deployed chatbots will continue to be improved with advanced
on Alexa Skills market [32]. technologies driven by ML- and NLP-based AI. As
Education Chatbots can be used to teach students basic
computer science concepts [35]. One paper previously mentioned, the Turing test, conducted by a
proposed Intelligent Tutoring Systems which are human conversational interrogator, is the most popular
computer environments which adapt to the needs of test for determining if a chatbot has achieved human-
the individual learner [36]. In particular, Open level intelligence [5], [59], The human judges ask
Learner Modelling allows the system and student to
jointly negotiate the learner model. This allows both questions to determine if one of participants is not human,
the student to reflect on their learning and the so if a chatbot can pass the test, it demonstrates human-
learner model to improve its accuracy. level communication capabilities. The goal of chatbots is
Financial Since the financial industry is increasingly to one day pass the Turing test and achieve human
deregulated, many financial transactions are now
digitized. This leaves financial businesses large
conversational capabilities, which we believe will happen.
amounts of financial and personal data to leverage
to deliver a variety of new services online [37]. For CONFLICT OF INTEREST
example, chatbots can be used to help financial
advisors and strategists with decision making based The authors declare no conflict of interest.
on previous financial transactions or trends.
AUTHOR CONTRIBUTIONS
Most conversations are held on text-based platforms
like email and online chat. An important variant on these PS and JHC designed the study; PS analyzed data and
conversational machines is the ability to think. It is why wrote the paper; XL and BW analyzed the data and co-
industries are moving towards a modern chatbot which wrote the paper; PM and JHC commented on the
uses AI technology to interact with a human more manuscript at all stages; all authors discussed the results.
intelligently. In past years, most chatbots in the industries All authors had approved the final version.
could only perform simple tasks because they are
programmed to respond to a predefined list of questions. ACKNOWLEDGMENT
In order to become self-learning chatbots, which is what This research is supported by the Petchra Pra Jom Klao
they may do in the future, they need to be trained using Doctoral scholarship, King Mongkut’s University of
data from their past conversations and update its Technology Thonburi (KMUTT), Thailand. Our thanks to
knowledge base autonomously to deliver personalized all members of Data Science and Engineering Laboratory
responses [57], [58]. (D-Lab), School of Information Technology (SIT) at
KMUTT for their useful comments and feedback.
VIII. CONCLUSIONS
In this paper, we used a mind-mapping approach to REFERENCES
present an overview of chatbots, after reviewing papers [1] R. Khan and A. Das, “Introduction to chatbots,” in Build Better
published from 1998 to 2018. This can help researchers Chatbots, Berkeley, CA: Apress, 2018, pp. 1–11.
develop a better understanding of the current [2] A. Følstad and P. B. Brandtzæg, “Chatbots and the new world of
implementation techniques and usages of chatbots. This HCI,” Interactions. ACM.org, vol. 24, no. 4, pp. 38–42, Jun.
2017.
is important because chatbots are becoming increasingly [3] A. Schlesinger, K. P. O’Hara, and A. S. Taylor, “Let’s talk about
popular, especially for customer service in the industry race: Identity, chatbots, and AI,” in Proc. the 2018 CHI
and as an intelligent virtual assistant for personal use. Conference on Human Factors in Computing Systems (CHI ’18),
This paper outlines many machine learning techniques 2018, pp. 1–14.
[4] R. Dale, “The return of the chatbots,” Natural Language
which could improve the performance of chatbots Engineering, vol. 22, no. 5, pp. 811–817, Sep. 2016.
because they allow chatbots to learn and adapt through [5] A. M. Turing, “Computing machinery and intelligence,” Mind,
experience. Having the ability to improve itself with vol. 49, pp. 433–460, 1950.
every interaction will likely improve the chatbot’s [6] P. B. Brandtzaeg and A. Følstad, “Why people use chatbots,” in
International Conference on Internet Science (INSCI 2017), 2017,
capability of understanding the content and context of the

© 2020 Int. J. Mech. Eng. Rob. Res 508


International Journal of Mechanical Engineering and Robotics Research Vol. 9, No. 4, April 2020

vol. 10673 LNCS, pp. 377–392. [26] D. Tomás and J. L. Vicedo, “Minimally supervised question
[7] M. B. Hoy, “Alexa, Siri, Cortana, and More: An introduction to classification on fine-grained taxonomies,” Knowledge and
voice assistants,” Medical Reference Services Quarterly, vol. 37, Information Systems, vol. 36, no. 2, pp. 303–334, Aug. 2013.
no. 1, pp. 81–88, Jan. 2018. [27] T. C. Zhou, M. R. Lyu, and I. King, “A classification-based
[8] M. Pinola, “History of voice recognition: from Audrey to Siri,” approach to question routing in community question answering,”
ITBusiness.ca, 2011. [Online]. Available: in Proc. of the 21st International Conference Companion on
https://www.itbusiness.ca/news/history-of-voice-recognition- World Wide Web - WWW ’12 Companion, 2012, pp. 783–790.
from-audrey-to-siri/15008. [Accessed: 12-Aug-2018]. [28] W. Zhang, T. Liu, Y. Wang, and Q. Zhu, “Neural personalized
[9] M. Saba, “A brief history of voice recognition technology,” Call response generation as domain adaptation,” Jan. 2017.
Analytics, Call Intelligence, Call Recording. [Online]. Available: [29] C. Y. Chang, S. J. Lee, and C. C. Lai, “Weighted word2vec based
https://www.callrail.com/blog/history-voice-recognition/. on the distance of words,” in Proc. of 2017 International
[Accessed: 13-Aug-2018]. Conference on Machine Learning and Cybernetics, ICMLC 2017,
[10] H. Sim, “Voice assistants: This is what the future of technology 2017, vol. 2, pp. 563–568.
looks like,” Forbes. [Online]. Available: [30] K. Chung and R. C. Park, “Chatbot-based heathcare service with
https://www.forbes.com/sites/herbertrsim/2017/11/01/voice- a knowledge base for cloud computing,” Cluster Computing, pp.
assistants-this-is-what-the-future-of-technology-looks- 1–13, 16-Mar-2018.
like/#389fc513523a. [Accessed: 13-Aug-2018]. [31] C. J. Baby, F. A. Khan, and J. N. Swathi, “Home automation
[11] Statista, “Number of internet users worldwide 2005-2017 | using IoT and a chatbot using natural language processing,” in
Statista,” The Statistics Portal, 2017. [Online]. Available: 2017 Innovations in Power and Advanced Computing
https://www.statista.com/statistics/273018/number-of-internet- Technologies (i-PACT), 2017, pp. 1–6.
users-worldwide/. [Accessed: 16-Jul-2018]. [32] A. Argal, S. Gupta, A. Modi, P. Pandey, S. Shim, and C. Choo,
[12] A. Xu, Z. Liu, Y. Guo, V. Sinha, and R. Akkiraju, “A new “Intelligent travel chatbot for predictive recommendation in echo
chatbot for customer service on social media,” in Proc. of the platform,” in 2018 IEEE 8th Annual Computing and
2017 CHI Conference on Human Factors in Computing Systems - Communication Workshop and Conference (CCWC), 2018, pp.
CHI ’17, 2017, pp. 3506–3510. 176–183.
[13] A. Galert, “Chatbot report 2018: Global trends and analysis,” [33] D. Madhu, C. J. N. Jain, E. Sebastain, S. Shaji, and A.
2017. [Online]. Available: https://chatbotsmagazine.com/chatbot- Ajayakumar, “A novel approach for medical assistance using
report-2018-global-trends-and-analysis-4d8bbe4d924b. trained chatbot,” in Proc. of the International Conference on
[Accessed: 17-Jul-2018]. Inventive Communication and Computational Technologies,
[14] Statista, “Retail e-commerce sales worldwide from 2014 to 2021 ICICCT 2017, 2017, pp. 243–246.
(in billion U.S. dollars),” Statista, 2017. [Online]. Available: [34] M. H. Su, C. H. Wu, K.-Y. Huang, Q. B. Hong, and H. M. Wang,
https://www.statista.com/statistics/379046/worldwide-retail-e- “A chatbot using LSTM-based multi-layer embedding for elderly
commerce-sales/. [Accessed: 19-Jul-2018]. care,” in 2017 International Conference on Orange Technologies
[15] K. Nimavat and T. Champaneria, “Chatbots: An overview types, (ICOT), 2017, pp. 70–74.
architecture, tools and future possibilities,” International Journal [35] L. Benotti, M. C. Martínez, and F. Schapachnik, “Engaging high
of Scientific Research and Development, vol. 5, no. 7, pp. 1019– school students using chatbots,” in Proc. the 2014 conference on
1024, Oct. 2017. Innovation & Technology in Computer Science Education -
[16] A. Deshpande, A. Shahane, D. Gadre, M. Deshpande, and P. M. ITiCSE ’14, 2014, pp. 63–68.
Joshi, “A survey of various chatbot implementation techniques,” [36] A. Kerly, P. Hall, and S. Bull, “Bringing chatbots into education:
International Journal of Computer Engineering and Applications, Towards natural language negotiation of open learner models,”
vol. XI, 2017. Knowledge-Based Systems, vol. 20, no. 2, pp. 177–185, Mar.
[17] O. S. Synekop, “Effective writing of students of technical 2007.
specialties,” Advanced Eduucation, no. 4, pp. 51–55, 2015. [37] F. Corea, “How AI Is Transforming Financial Services,” in
[18] M. C. Jenkins, R. Churchill, S. Cox, and D. Smith, “Analysis of Applied Artificial Intelligence: Where AI Can Be Used In
user interaction with service oriented chatbot systems,” in Business, Springer, Cham, 2018, pp. 11–17.
Human-Computer Interaction. HCI Intelligent Multimodal [38] M. Yan, P. Castro, P. Cheng, and V. Ishakian, “Building a
Interaction Environments, Berlin, Heidelberg: Springer Berlin Chatbot with Serverless Computing,” in Proc. the 1st
Heidelberg, 2007, pp. 76–83. International Workshop on Mashups of Things and APIs -
[19] V. Ravi and S. Kamaruddin, “Big data analytics enabled smart MOTA ’16, 2016, pp. 1–4.
financial services: Opportunities and challenges,” in [39] A. M. Rahman, A. Al Mamun, and A. Islam, “Programming
International Conference on Big Data Analytics (BDA 2017), challenges of chatbot: Current and future prospective,” in 5th
2017, vol. 10721 LNCS, pp. 15–39. IEEE Region 10 Humanitarian Technology Conference 2017,
[20] F. Halper, “Advanced analytics: Moving toward AI, machine R10-HTC 2017, 2017, vol. 2018–Janua, pp. 75–78.
learning, and natural language processing,” TDWI Best Practices [40] K. B. Reshmi S, “Empowering chatbots with business
Report, 2017. [Online]. Available: intelligence by big data integration,” International Journal of
https://www.sas.com/content/dam/SAS/en_us/doc/whitepaper2/td Advanced Research in Computer Science, vol. 9, no. 1, pp. 627–
wi-advanced-analytics-ai-ml-nlp-109090.pdf. [Accessed: 07- 631, 2018.
May-2018]. [41] Z. Yan et al., “DocChat: An information retrieval approach for
[21] S. Quarteroni, “Natural language processing for industry: chatbot engines using unstructured documents,” in Proc. of the
ELCA’s experience,” Informatik-Spektrum, vol. 41, no. 2, pp. 54th Annual Meeting of the Association for Computational
105–112, Apr. 2018. Linguistics (Volume 1: Long Papers), 2016, pp. 516–525.
[22] V. Ng, S. Dasgupta, and S. M. N. Arifin, “Examining the role of [42] D. Mollá and J. L. Vicedo, “Question answering in restricted
linguistic knowledge sources in the automatic identification and domains: an overview,” Computational Linguistics, vol. 33, no. 1,
classification of reviews,” in Proc. COLING-ACL ’06 pp. 41–61, Mar. 2007.
Proceedings of the COLING/ACL on Main conference poster [43] K. Ramesh, S. Ravishankaran, A. Joshi, and K. Chandrasekaran,
sessions, 2006, pp. 611–618. “A survey of design techniques for conversational agents,” in
[23] P. Molino, L. M. Aiello, and P. Lops, “Social question answering: Information, Communication and Computing Technology
Textual, user, and network features for best answer prediction,” (ICICCT), 2017, pp. 336–350.
ACM Transactions on Information Systems, vol. 35, no. 1, pp. 1– [44] S. Reshmi and K. Balakrishnan, “Implementation of an
40, Sep. 2016. inquisitive chatbot for database supported knowledge bases,”
[24] J. Cahn, “CHATBOT: Architecture, design, and development,” Sadhana - Academy Proceedings in Engineering Sciences, vol.
University of Pennsylvania, 2017. 41, no. 10, pp. 1173–1178, 2016.
[25] S. J. Yen, Y. C. Wu, J. C. Yang, Y. S. Lee, C. J. Lee, and J. J. Liu, [45] H. Shum, X. He, and D. Li, “From Eliza to XiaoIce: Challenges
“A support vector machine-based context-ranking model for and opportunities with social chatbots,” Frontiers of Information
question answering,” Information Sciences, vol. 224, pp. 77–87, Technology & Electronic Engineering, vol. 19, no. 1, pp. 10–26,
Mar. 2013. Jan. 2018.

© 2020 Int. J. Mech. Eng. Rob. Res 509


International Journal of Mechanical Engineering and Robotics Research Vol. 9, No. 4, April 2020

[46] J. Le, Z. Niu, and C. Zhang, “Question classification based on


Prissadang Suta, received her B.S. in
fine-grained PoS annotation of nouns and interrogative
Computer Engineering from Mae Fah Luang
pronouns,” in Pacific Rim International Conference on Artificial
University, Chiang Rai, Thailand, in 2010
Intelligence (PRICAI 2014): Trends in Artificial Intelligence,
and received her M.S. in Software
2014, pp. 680–693.
Engineering, School of Information
[47] M. Mishra, V. K. Mishra, and S. H.R., “Question classification
Technology (SIT), King Mongkut’s
using semantic, syntactic and lexical features,” International
University of Technology Thonburi
journal of Web & Semantic Technology, vol. 4, no. 3, pp. 39–47,
(KMUTT), Bangkok, Thailand, in 2015. She
2013.
is currently a Ph.D. student in Computer
[48] C. D. Manning, P. Raghaven, and H. Schuetze, “Stemming and
Science at SIT, KMUTT. Her research
Lemmatization,” in Introduction to Information Retrieval, 2009,
interested is in intelligent chatbot.
pp. 22–34.
[49] L. Zhu, L. S. Chao, D. F. Wong, and X. D. Zeng, “A noun-phrase
chunking model based on SBCB ensemble learning algorithm,”
in Proc. - International Conference on Machine Learning and Xi Lan is currently in her fourth year of
Cybernetics, 2012, vol. 1, pp. 11–16. undergraduate study in Engineering Science,
[50] D. Molla, M. Zaanen, and D. Smith, “Named entity recognition University of Toronto, Toronto, Canada. Her
for question answering,” in Proc. the Australasian Language specialization in her program is robotics
Technology Workshop 2006, 2006, pp. 51–58. engineering. Her interests are in computer
[51] X. Li and R. Dan, “Learning question classifiers,” in vision, circuit design and network security.
COLING ’02 Proc.s of the 19th international conference on
Computational linguistics, 2002, vol. 1, pp. 1–7.
[52] Q. V. Le and T. Mikolov, “Distributed representations of
sentences and documents,” in Proc. of the 31st International
Conference on Machine Learning, vol. 32, no. 2, pp. 1188–1196,
May 2014.
[53] Z. Ji, Z. Lu, and H. Li, “An information retrieval approach to Biting Wu is currently in her fourth year of
short text conversation,” pp. 1–21, Aug. 2014. undergraduate study in Engineering Science,
[54] O. Vinyals and Q. Le, “A neural conversational model,” in Proc. University of Toronto, Toronto, Canada. In
of the 31 st International Conference on Machine Learning, 2015, her program, her specialization is in
vol. 37, pp. 233–239. mathematics, statistics, and finance. She has
[55] J. Li, W. Monroe, A. Ritter, D. Jurafsky, M. Galley, and J. Gao, worked with and is interested in various
“Deep reinforcement learning for dialogue generation,” in Proc. machine learning applications, such as image
of the 2016 Conference on Empirical Methods in Natural classification, analytics, and modelling.
Language Processing, 2016, pp. 1192–1202.
[56] A. M. Seeger and A. Heinzl, “Human versus machine:
Contingency factors of anthropomorphism as a trust-inducing
design strategy for conversational agents,” in Lecture Notes in
Information Systems and Organisation, vol. 25, Springer, Cham, Pornchai Mongkolnam holds a Ph.D. in
2017, pp. 129–139. computer science from Arizona State
[57] F. Sweis, “Building and training self-learning chatbots: University and currently works at School of
Developers, you can drive the chatbot revolution,” Information Technology (SIT) at KMUTT in
ComputerWorld, 2017. [Online]. Available: Thailand. He is the Head of Data Science and
https://www.computerworld.com.au/article/631249/building- Engineering Laboratory (D-Lab) at SIT,
training-self-learning-chatbots-developers-can-drive-chatbot- KMUTT.
revolution/. [Accessed: 14-Aug-2018].
[58] L. Vishnoi, “How the development of AI has advanced the
technology available for chatbots,” Forbes Technology Council,
2018. [Online]. Available:
https://www.forbes.com/sites/forbestechcouncil/2018/05/23/how-
the-development-of-ai-has-advanced-the-technology-available- Jonathan H. Chan is an Associate Professor
for-chatbots/#2038c11fc213. [Accessed: 14-Aug-2018]. at the School of Information Technology,
[59] K. Warwick and H. Shah, “Passing the turing test does not mean KMUTT, Thailand. Jonathan received his
the end of humanity,” Cognitive Computation, vol. 8, no. 3, pp. B.A.Sc., M.A.Sc., and Ph.D. degrees from
409–419, 2016. the University of Toronto, Canada. He is a
senior member of IEEE, ACM, APNNS and
Copyright © 2020 by the authors. This is an open access article INNS. His research interests include
distributed under the Creative Commons Attribution License (CC BY- intelligent systems, biomedical informatics,
NC-ND 4.0), which permits use, distribution and reproduction in any machine learning, and data science.
medium, provided that the article is properly cited, the use is non-
commercial and no modifications or adaptations are made.

© 2020 Int. J. Mech. Eng. Rob. Res 510

You might also like