Chatbots: Human Language Technologies
Chatbots
Slide by CY Chen
Early Approaches
ELIZA (Weizenbaum, 1966)
Used clever hand-written templates to generate replies that resemble the user's input utterances.
Several programming frameworks are available today for building dialog agents (Marietto et al., 2013; Microsoft, 2017b), e.g. Google Assistant.
Templates and Rules
Hand-written rules are used to generate replies: simple pattern matching or keyword retrieval techniques handle the user's input utterances, and rules transform a matching pattern or keyword into a predefined reply. For example, in AIML:

<category>
<pattern>What is your name?</pattern>
<template>My name is Alice</template>
</category>

<category>
<pattern>I like *</pattern>
<template>I too like <star/>.</template>
</category>
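To make the pattern-matching idea concrete, here is a minimal Python sketch of an AIML-style matcher. It is illustrative only: the RULES table and reply() helper are hypothetical, and real AIML engines implement much more of the spec.

import re

# Hypothetical rule base: (pattern, template) pairs. "*" is a wildcard
# whose match is substituted for <star/> in the template.
RULES = [
    ("What is your name?", "My name is Alice"),
    ("I like *", "I too like <star/>."),
]

def reply(utterance):
    for pattern, template in RULES:
        # Turn the AIML wildcard "*" into a regex capture group.
        regex = "^" + re.escape(pattern).replace(r"\*", "(.+)") + "$"
        match = re.match(regex, utterance, re.IGNORECASE)
        if match:
            star = match.group(1) if match.groups() else ""
            return template.replace("<star/>", star)
    return "I do not understand."

print(reply("I like pizza"))  # -> I too like pizza.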
Open vs Closed Domain Conversations
[Diagram: the chatbot design space, from closed to open domain and from rule-based to machine learning, with responses ranging from retrieval-based to generative]
Two Paradigms
Retrieval-based
- No grammatical mistakes.
- Unable to handle unseen cases for which no appropriate predefined response exists.
- Can't refer back to contextual entity information like names mentioned earlier in the conversation.

Generative
- Can refer back to entities in the input and give the impression of talking to a human.
- Hard to train.
- Likely to make grammatical mistakes (especially on longer sentences).
- Typically requires huge amounts of training data.
Long vs Short Conversations
The longer the conversation, the more difficult it is to automate.
import functools
import tensorflow as tf

def create_evaluation_metrics():
    # Recall@k metrics for k = 1, 2, 5, 10.
    eval_metrics = {}
    for k in [1, 2, 5, 10]:
        eval_metrics["recall_at_%d" % k] = functools.partial(
            tf.contrib.metrics.streaming_sparse_recall_at_k,
            k=k)
    return eval_metrics
streaming_sparse_recall_at_k
tf.contrib.metrics.streaming_sparse_recall_at_k(
predictions,
labels,
k,
class_id=None,
weights=None,
metrics_collections=None,
updates_collections=None,
name=None
)
Computes recall@k of the predictions with respect to sparse labels.
https://www.tensorflow.org/api_docs/python/tf/contrib/metrics/streaming_sparse_recall_at_k
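To make the metric concrete (this is a plain-Python illustration, not the TF implementation): an example counts as a hit if its true label appears among the k highest-scoring candidates, and recall@k is the fraction of hits. The recall_at_k helper below is hypothetical.

import numpy as np

def recall_at_k(scores, labels, k):
    # scores: [batch, num_classes]; labels: [batch] true class ids.
    top_k = np.argsort(-scores, axis=1)[:, :k]   # top-k class ids per example
    hits = [label in row for row, label in zip(top_k, labels)]
    return np.mean(hits)

scores = np.array([[0.1, 0.7, 0.2],   # true label 1 is the top prediction
                   [0.5, 0.3, 0.2]])  # true label 2 is only in the top 3
labels = np.array([1, 2])
print(recall_at_k(scores, labels, 1))  # 0.5
print(recall_at_k(scores, labels, 3))  # 1.0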
Creating the Model
model_fn = udc_model.create_model_fn(
hparams=hparams,
model_impl=dual_encoder_model)
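The dual encoder scores a (context, response) pair by encoding both into vectors and predicting a match probability from a bilinear product. Below is a minimal NumPy sketch of that scoring idea; the encode() stub stands in for the shared LSTM encoder, and all names here are hypothetical rather than the tutorial's actual code.

import numpy as np

dim = 4
M = np.random.default_rng(0).normal(size=(dim, dim))  # learned matrix

def encode(text):
    # Stand-in for the LSTM encoder: a deterministic pseudo-embedding.
    seed = sum(ord(ch) for ch in text)
    return np.random.default_rng(seed).normal(size=dim)

def score(context, response):
    c, r = encode(context), encode(response)
    logit = c @ M @ r                    # bilinear score c^T M r
    return 1.0 / (1.0 + np.exp(-logit))  # probability of a good match

print(score("how do i install tensorflow", "pip install tensorflow"))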
Generative Model in TensorFlow
Deep Learning for Chatbots
Deep Learning techniques can be used for both retrieval-based and generative models, but research seems to be moving in the generative direction.
Deep Learning architectures like Sequence to Sequence are well suited to generating text, and research is making rapid progress in this area.
We are still at the early stages of building generative models that work reasonably well.
Production systems are more likely to be retrieval-based for now.
Twitter bot
Query: donald trump won last nights presidential debate according to snap online polls
Reply: thought he was a joke

Query: trump campaign please never delete this
Reply: he was a joke

Query: s new scoop is unk the castro connection how trumps company violated the us embargo against cuba
Reply: trump is a disgrace to the debate and the media is not a liar

Query: who won the first presidential debate
Reply: trump will be a better time for a man

Query: just wanna live in unk everything is 10x better there
Reply: i was just on the same side and i was like it was a good time

Query: the lil girl i forgot her name scares the shit out of me n probably always will lmaooo
Reply: she was so cute and she was so cute and she was a bitch

Query: question of the day who do you think won last nights debate weigh in using
Reply: hillary is a trump supporter and a woman who has a better job than a woman

Query: trump is a total embarrassment hes like a unk child who happens to be a professional liar and con man
Reply: trump is a lot of money to get a great country

Query: chill the fuck out
Reply: thats what i said

Query: so fun to see what the special effects team created for
Reply: wow that is the best thing ive ever seen

Query: she was fine too unk voice
Reply: she is a good guy

Query: the important ones are still here
Reply: i dont think so

Query: been drinking pumpkin spice protein shake every morning for a week and now i literally cannot even
Reply: me too i just got it

Query: lmao i just lost my job and i cant even leave early to be in the unk video today
Reply: literally the same thing

Query: hey happy birthday have a nice day
Reply: thank you
Cornell Movie Dialog Corpus
Query: when are you going to let the police know
Reply: you dont know what youre talking about

Query: you can do it
Reply: yeah i think so

Query: wait are you saying that i dont appreciate
Reply: i know you know what i mean

Query: youre terrible
Reply: yeah but im not

Query: nah i dont think so
Reply: why not

Query: hes pissing in our faces again and were just taking it
Reply: im sorry i dont know what else to say except im sorry

Query: nobodys gonna get you now get inside
Reply: i cant go back

Query: shall i leave
Reply: of course

Query: well i really think hes got a chance
Reply: i know

Query: youd better be quiet sandy
Reply: shut up

Query: jesus christ you scared the shit out of me
Reply: whats going on

Query: well im sorry im really sorry ellie
Reply: its okay

Query: my lady this play will end badly i will tell
Reply: lets get out of here

Query: im going to miss you
Reply: no youre not

Query: what do you mean
Reply: i dont know i dont know what i mean

Query: my god these people are insane
Reply: we dont know what they are

Query: this isnt a date
Reply: no what is it

Query: you ought to go home and take care of that
Reply: i cant do that

Query: is something wrong
Reply: no no no

Query: why would you like to be
Reply: i cant
Implementation
basic_cell = tf.nn.rnn_cell.DropoutWrapper(
tf.nn.rnn_cell.BasicLSTMCell(emb_dim, state_is_tuple=True),
output_keep_prob=self.keep_prob)
# stack cells together: n layered model
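The stacking step the comment refers to is elided on the slide. Assuming the same TF 1.x API and hypothetical num_layers and embedded_inputs variables, a plausible continuation is:

# Stack n dropout-wrapped cells and unroll them over the inputs.
# Note: later TF 1.x releases require a fresh cell per layer
# rather than reusing [basic_cell] * num_layers.
stacked_cell = tf.nn.rnn_cell.MultiRNNCell(
    [basic_cell] * num_layers, state_is_tuple=True)
outputs, final_state = tf.nn.dynamic_rnn(
    stacked_cell, embedded_inputs, dtype=tf.float32)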
https://ai.facebook.com/blog/state-of-the-art-open-source-chatbot/
https://arxiv.org/pdf/2004.13637.pdf
Recipe: Scale, Blending Skills and
Generation
The best current systems train high-capacity neural models with millions or billions of parameters using huge text corpora.
Our new recipe incorporates large-scale neural
models, with up to 9.4 billion parameters, and
also techniques for blending skills and detailed
generation.
We pretrained large Transformer neural
networks on large amounts of conversational
data (1.5 billion training examples)
Our neural networks are too large to fit on a
single device, so we utilized techniques such as
column-wise model parallelism, which allows
us to split the neural network into smaller
pieces while maintaining maximum efficiency.
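Column-wise model parallelism can be illustrated with a toy NumPy example, using arrays as a stand-in for separate devices: the weight matrix of a linear layer is split by columns, each shard computes its slice of the output independently, and the partial results are concatenated at the end.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))            # batch of activations
W = rng.normal(size=(8, 6))            # full weight matrix

shards = np.split(W, 2, axis=1)        # column shards, one per "device"
partial = [x @ w for w in shards]      # computed independently
y_parallel = np.concatenate(partial, axis=1)

assert np.allclose(y_parallel, x @ W)  # matches the unsharded result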
The Poly-encoder Transformer Retriever
Architecture
Given a dialogue history (context) as input, retrieval systems select the next dialogue utterance by scoring a large set of candidate responses (during training, all of them) and outputting the highest-scoring one.
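In vector form, that retrieval step reduces to scoring every candidate against the encoded context and returning the argmax. The sketch below shows only that select-by-score skeleton with stubbed embeddings; a real poly-encoder uses Transformer encoders and attention over multiple context codes.

import numpy as np

def retrieve(context_vec, candidate_vecs, candidates):
    scores = candidate_vecs @ context_vec      # one score per candidate
    return candidates[int(np.argmax(scores))]  # highest-scoring response

rng = np.random.default_rng(1)
context_vec = rng.normal(size=16)
candidates = ["hi there!", "i love football", "whats your favorite team?"]
candidate_vecs = rng.normal(size=(3, 16))
print(retrieve(context_vec, candidate_vecs, candidates))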
Blending Skills
Engaging use of personality (PersonaChat)
Engaging use of knowledge (Wizard of Wikipedia)
Display of empathy (Empathetic Dialogues)
Ability to blend all three seamlessly (Blended Skill Talk, BST)
Generation Strategies
Standard Seq2Seq Transformer architecture to generate responses
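As a skeleton of how a seq2seq model emits a response at inference time, here is a minimal greedy decoding loop. The next_token_logits() function is a hypothetical stand-in for the trained Transformer, and real systems typically use beam search or sampling rather than pure greedy decoding.

import numpy as np

VOCAB = ["<eos>", "i", "am", "a", "bot", "hello"]

def next_token_logits(context, prefix):
    # Stub for the model: deterministic toy scores per step.
    return np.random.default_rng(len(prefix)).normal(size=len(VOCAB))

def greedy_decode(context, max_len=10):
    prefix = []
    for _ in range(max_len):
        token = VOCAB[int(np.argmax(next_token_logits(context, prefix)))]
        if token == "<eos>":            # stop at end-of-sequence
            break
        prefix.append(token)
    return " ".join(prefix)

print(greedy_decode("hello there"))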