6 Technical Approaches For Building Conversational AI - TOPBOTS
6 TECHNICAL APPROACHES FOR BUILDING CONVERSATIONAL AI
If you were using a modern graphical user interface (GUI), you would:
1. Go to your computer
2. Open up a browser
3. Type in Amazon
4. Then type “toilet paper” into the search window
5. Make a choice, but then be confronted with more choices over how many packs to get
6. Sign in if you haven’t already
Or, you can avoid the struggle and just tell your Amazon Echo to order
you some toilet paper.
One of the first decisions that you’d need to make is how your bot will process dialogue inputs and produce replies (each armed with a potentially different approach to NLP and NLU). Most current production systems use rule-based or retrieval-based methods, while generative methods, grounded learning, and interactive learning are active areas of research.
1. RULE-BASED
Rule-based systems are trained on a predefined hierarchy of rules that
govern how to transform user input into output dialogue or actions.
Rules can range from simple to complex, and a rule-based system is
relatively straightforward to create. However, these systems aren’t able
to respond to input patterns or keywords that don’t match existing
rules.
Remember Microsoft DOS and how painful it was to use? MS-DOS and other terminal interfaces are actually examples of rule-based conversational interfaces. Though the user has to learn a terse and difficult array of commands, the system responds in a predictable manner if the user provides the correct command. As older users may recall, MS-DOS offered no error handling; if the commands and associated syntax weren’t entered exactly as directed, then the system simply threw an error message and did nothing.
Rule-based conversational systems don’t have to suck. Eliza, an MIT
chatbot created in the 1960s, fooled many users into thinking that it
was a real therapist with its sophisticated rule-based dialogue
generation. Eliza first scanned the input text for keywords, assigned
each keyword a programmer-designated rank, decomposed and
reassembled the input sentence based on the highest-ranking keyword,
and if it encountered remarks that didn’t match any known keyword,
prompted the user to provide more input (“Tell me more about that”).
Apparently that was enough to make some people think that Eliza was
a better listener than their human acquaintances!
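Eliza’s keyword-rank-decompose-reassemble loop can be sketched in a few lines. The keywords, ranks, and response templates below are invented for illustration and are not from Weizenbaum’s original script; the structure (rank the matched keywords, decompose on the winner, reassemble a reply, fall back to a neutral prompt) is what the description above outlines.

```python
import re

# Toy Eliza-style rules: keyword -> (rank, decomposition pattern, reassembly template).
# Keywords and ranks are illustrative, not Weizenbaum's original script.
RULES = {
    "mother": (3, r".*\bmother\b(.*)", "Tell me more about your mother{0}."),
    "always": (2, r".*\balways\b(.*)", "Can you think of a specific example when{0}?"),
    "i feel": (1, r".*\bi feel\b(.*)", "Why do you feel{0}?"),
}
# Fallback for remarks that match no known keyword.
DEFAULT = "Tell me more about that."

def respond(text):
    text = text.lower().rstrip(".!?")
    # Collect every rule whose keyword appears in the input.
    matches = [(rank, pat, tpl) for kw, (rank, pat, tpl) in RULES.items() if kw in text]
    if not matches:
        return DEFAULT
    # Decompose on the highest-ranking keyword and reassemble the fragment after it.
    rank, pat, tpl = max(matches)
    m = re.match(pat, text)
    return tpl.format(m.group(1)) if m else DEFAULT

print(respond("I feel sad"))        # -> Why do you feel sad?
print(respond("Nice weather today"))  # -> Tell me more about that.
```

The whole trick is that the reply echoes a fragment of the user’s own sentence back, which reads as attentive listening.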
2. RETRIEVAL-BASED
Retrieval-based methods power the bulk of production systems in use
today.
When given user input, the system uses heuristics to locate the best response from its database of pre-defined responses. Dialogue selection is essentially a prediction problem: using heuristics to identify the most appropriate response template may involve simple algorithms like keyword matching, or it may require more complex processing with machine learning or deep learning. Regardless of the heuristic used, these systems only regurgitate pre-defined responses and do not generate new output.
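A minimal sketch of the retrieval idea, using word-overlap (Jaccard similarity) as the heuristic. The response database and the similarity threshold are invented for illustration; production systems would use far richer features or a learned ranker, but the shape is the same: score every canned response against the input and return the winner.

```python
# Invented response database: trigger phrase -> canned reply.
RESPONSES = {
    "what are your store hours": "We're open 9am-9pm, Monday through Saturday.",
    "where is my order": "You can track your order from the Orders page.",
    "how do i reset my password": "Use the 'Forgot password' link on the sign-in page.",
}

def best_response(query):
    q = set(query.lower().split())
    def overlap(trigger):
        # Jaccard similarity between the query's words and the trigger's words.
        t = set(trigger.split())
        return len(q & t) / len(q | t)
    trigger = max(RESPONSES, key=overlap)
    # Only answer if the best match clears a (hand-picked) threshold.
    if overlap(trigger) > 0.3:
        return RESPONSES[trigger]
    return "Sorry, I don't have an answer for that."

print(best_response("where is my order"))  # -> You can track your order from the Orders page.
```

Note that no matter how clever the scoring gets, the bot can only ever say one of the three sentences in its database.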
Retrieval-based systems need a lot of data pre-processing and custom application logic. For example, the original IBM Watson was
built for the sole purpose of competing on Jeopardy!, and it had
sophisticated modules to preprocess questions, generate answers, and
score hypotheses.
3. GENERATIVE METHODS
Overcoming the limitations of the previous two approaches requires
that the conversational AI be smart enough and creative enough to
generate new content. Instead of drawing upon pre-defined responses, conversational AI that uses generative methods is given a large amount of conversational training data in order to learn how to generate new dialogue that resembles it.
While adversarial methods have worked well for images (such as with
the use of GANs, generative adversarial networks), they aren’t as
productive for use in dialog systems. Unlike pixel values, words are discrete and cannot be infinitesimally perturbed.
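To make the “learn from data, then generate” idea concrete, here is a deliberately tiny sketch using a bigram chain rather than the neural sequence models real generative systems use; the three-line corpus is invented. The point it illustrates is the contrast with retrieval: the output is assembled word by word from learned statistics, so the model can emit sentences that appear nowhere in its training data.

```python
import random
from collections import defaultdict

# Invented toy training corpus.
corpus = [
    "i love seafood restaurants",
    "i love this place",
    "this place has great seafood",
]

# Learn word-transition statistics: word -> list of observed next words.
transitions = defaultdict(list)
for line in corpus:
    words = ["<s>"] + line.split() + ["</s>"]
    for a, b in zip(words, words[1:]):
        transitions[a].append(b)

def generate():
    # Sample a new utterance by walking the learned transitions.
    word, out = "<s>", []
    while True:
        word = random.choice(transitions[word])
        if word == "</s>":
            return " ".join(out)
        out.append(word)

print(generate())  # e.g. "i love this place has great seafood" - possibly novel
```

The discreteness point above is also visible here: each step picks a whole word from a finite vocabulary, so there is no way to nudge an output “slightly” the way a GAN perturbs pixel values.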
4. ENSEMBLE METHODS
Recent state-of-the-art conversational AI such as the Alexa Prize bots, which were designed to be conversational bots that could talk about any subject (a very difficult problem!), have been built with ensemble methods, which use some combination of rule-based, retrieval-based, and generative approaches as dictated by context.
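One common way to combine the three approaches is a dispatcher that tries the most constrained method first and falls through to the most open-ended one. The handlers below are stubs with invented rules and replies, standing in for the real components; the routing logic is the part being illustrated.

```python
def rule_based(text):
    # A rule fires only on an exact greeting; otherwise defer.
    if text.strip().lower().rstrip("!?.") in ("hi", "hello"):
        return "Hello! How can I help?"
    return None

def retrieval_based(text):
    # Invented mini response database.
    faq = {"hours": "We're open 9am-9pm."}
    for keyword, answer in faq.items():
        if keyword in text.lower():
            return answer
    return None

def generative(text):
    # Stand-in for a trained generative model: always has *something* to say.
    return "Interesting - tell me more."

def ensemble_reply(text):
    # Try the most precise handler first, fall back to the most flexible.
    for handler in (rule_based, retrieval_based, generative):
        reply = handler(text)
        if reply is not None:
            return reply

print(ensemble_reply("hello"))            # -> Hello! How can I help?
print(ensemble_reply("what are your hours"))  # -> We're open 9am-9pm.
```

Ordering the cascade this way keeps the predictable, high-precision components in charge whenever they apply, while the generative fallback guarantees the bot never goes silent.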
5. GROUNDED LEARNING
Human dialogue relies extensively on context and external knowledge.
For example, if you told a chatbot that you were going to the Swan
Oyster Depot, that chatbot would probably recognize Swan Oyster
Depot as a restaurant, possibly a seafood restaurant, and it may tell
you to have a good time. Telling a local may result in a
recommendation for the Sicilian sashimi, but telling someone who
watches a lot of CNN may instead get you a monologue about Anthony
Bourdain’s fervent love of the place.
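The restaurant example boils down to one mechanism: the same utterance grounds to different replies depending on what external knowledge the listener can draw on. Here is a minimal sketch of that idea; both knowledge bases and the reply templates are invented.

```python
# Two invented knowledge bases: a generic one and a better-informed "local" one.
GENERIC_KB = {"Swan Oyster Depot": {"type": "seafood restaurant"}}
LOCAL_KB = {"Swan Oyster Depot": {"type": "seafood restaurant",
                                  "tip": "try the Sicilian sashimi"}}

def grounded_reply(utterance, kb):
    # Ground the utterance: look for a known entity and reply from its facts.
    for entity, facts in kb.items():
        if entity in utterance:
            if "tip" in facts:
                return f"Since you're going to that {facts['type']}, {facts['tip']}!"
            return f"Enjoy the {facts['type']}!"
    # No recognizable entity: fall back to a generic pleasantry.
    return "Have a good time!"

print(grounded_reply("I'm going to the Swan Oyster Depot", GENERIC_KB))
print(grounded_reply("I'm going to the Swan Oyster Depot", LOCAL_KB))
```

The hard research problems are hidden inside the lookup, of course: recognizing entities in free text and deciding which external facts are relevant.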
6. INTERACTIVE LEARNING
Language is inherently interactive. Humans use language to facilitate cooperation when they need to solve problems together, and practical needs influence how language continues to develop.
In SHRDLURN, a language game developed by Percy Liang’s group at Stanford, the human operator knows the desired goal of the game but has no direct control over the game pieces; the computer has control but does not understand language. The human player’s goal is
to iteratively instruct the computer to map language to concepts until it
can perform the correct actions to complete the task.
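The interaction loop can be sketched as a guess-and-feedback game. This is a drastically simplified stand-in for SHRDLURN (which scores candidate semantic parses, not flat utterance-action pairs), with invented action names: the computer guesses an action for an unfamiliar command, the human accepts or rejects the guess, and accepted mappings are remembered.

```python
# Invented action inventory for the toy game.
ACTIONS = ["add_red_block", "add_blue_block", "remove_block"]

class Learner:
    def __init__(self, actions):
        self.actions = actions
        self.meanings = {}    # utterance -> confirmed action
        self.next_guess = {}  # utterance -> index of the next action to try

    def act(self, utterance):
        # Use a confirmed meaning if we have one; otherwise guess in order.
        if utterance in self.meanings:
            return self.meanings[utterance]
        i = self.next_guess.get(utterance, 0)
        return self.actions[i % len(self.actions)]

    def feedback(self, utterance, accepted):
        if accepted:
            self.meanings[utterance] = self.act(utterance)   # remember it
        else:
            self.next_guess[utterance] = self.next_guess.get(utterance, 0) + 1

bot = Learner(ACTIONS)
# The human wants "add_blue_block" when saying "blue one please",
# and rejects guesses until the computer gets it right.
while bot.act("blue one please") != "add_blue_block":
    bot.feedback("blue one please", accepted=False)
bot.feedback("blue one please", accepted=True)
print(bot.act("blue one please"))  # -> add_blue_block
```

As in SHRDLURN, what matters for the learner is not which words the human chooses but that the feedback is consistent: an inconsistent teacher would confirm conflicting mappings and the bot would never converge.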
As it turns out, the actual language that human players used to teach the computer mattered less than their ability to issue clear and consistent commands. Based on his experiences with
SHRDLURN, Liang observed, “How do we represent knowledge, context,
memory? Maybe we shouldn’t be focused on creating better models,
but rather better environments for interactive learning.”