
Gmail Smart Compose: Real-Time Assisted Writing


Contents

01 Natural Language Processing
02 Smart Reply
03 Data & its Preprocessing
04 Language Models
05 Beam Search Algorithm
06 Personalisation
Natural Language Processing (NLP)

● Natural language comes in many forms:
❖ Signs
❖ Voice
❖ Text
❖ Menus
❖ Email
❖ SMS
❖ Web pages, and so much more… the list is endless.

● Google’s latest NLP development - Smart Compose


● A core component of NLP - Language Modelling
Smart Reply

➢ Generates automatic response suggestions for an incoming mail.

➢ Constrained to a small, human-curated whitelist with limited context-dependency; in contrast, Smart Compose suggestions are open-ended and flexible.

➢ Smart Compose - Multilingual & Personalised.


Gmail Smart Compose - Challenges

➢ Latency - Suggestions appear as the user is typing, so minimizing end-to-end latency is critical. Model inference needs to be performed on almost every keystroke.

➢ Scale - Gmail serves more than 1.5 billion diverse users. The model needs to have enough capacity to make tailored, high-quality suggestions in subtly different contexts.

➢ Personalisation - Users often have their own unique email writing styles.

➢ Fairness and Privacy - Sources of potential bias in the training process need to be addressed, while adhering to strict user-privacy standards.
Fairness and Privacy

A biased model can absorb gendered associations from its training data and reproduce them in suggestions, e.g.:

I am meeting an investor next week. Did you want to meet him?

I am meeting a nurse next week. Did you want to meet her?

Data for the Model

● Mail being written
● Contextual information:
○ Previous e-mail, in case the composed e-mail is a response.
○ Subject of the e-mail.
○ Date and time of the composed e-mail.
○ Locale of the user composing the e-mail.

Date & Time - for suggesting "Good Morning", "Happy New Year", etc.

Locale - to distinguish between spellings (e.g. "colour" vs. "color").
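As a rough illustration, this contextual information could be bundled into a record like the one below (a Python sketch; the field names are hypothetical, not Gmail's actual schema):

    from dataclasses import dataclass
    from typing import Optional

    # Hypothetical bundle of the signals listed above; illustrative only.
    @dataclass
    class ComposeContext:
        prefix: str                     # the mail text typed so far
        subject: str                    # subject of the e-mail
        previous_email: Optional[str]   # set when the mail is a response
        timestamp: float                # date and time of composition
        locale: str                     # e.g. "en_GB" vs "en_US" spellings

    ctx = ComposeContext(
        prefix="Dear Tom, tha",
        subject="Lunch next week",
        previous_email=None,
        timestamp=1715000000.0,
        locale="en_GB",
    )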


Preprocessing the Data

● Language Detection
● Segmentation/Tokenization
● Normalisation
● Quotation Removal
● Salutation/Close Removal
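A minimal sketch of these steps in Python (the regular expressions are crude simplifications, not the production pipeline, and language detection is omitted):

    import re

    def preprocess(email_body: str) -> list[str]:
        # Quotation removal: drop quoted lines from earlier messages ("> ...").
        lines = [l for l in email_body.splitlines()
                 if not l.lstrip().startswith(">")]
        text = " ".join(lines)

        # Salutation/close removal: crude patterns for openings and sign-offs.
        text = re.sub(r"^\s*(hi|hello|dear)\b[^,]*,", "", text, flags=re.I)
        text = re.sub(r"\b(regards|cheers|thanks|best)[\s,]*\w*\s*$", "", text, flags=re.I)

        # Normalisation: lowercase and collapse whitespace.
        text = re.sub(r"\s+", " ", text.lower()).strip()

        # Segmentation/Tokenization: naive whitespace split.
        return text.split()

    print(preprocess("Hi Tom,\n> old quoted text\nSee you soon.\nRegards, Ann"))
    # ['see', 'you', 'soon.']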
Language Model - Introduction

Model - a mathematical representation of a real-world process.

Definition - a model that estimates the probability of the next word, or of a sequence of words.

yt = f(yt-1) - the next token is predicted from the preceding context.

Objective - to maximise the log probability of producing the correct target sequence.

Types - statistical models and neural network models.

Typical language generation models are n-gram, neural bag-of-words (BoW) and RNN language model (RNN-LM) variants.
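As a toy example of a statistical language model, a bigram model estimates the probability of the next word from raw counts (a sketch, far simpler than the Smart Compose model):

    from collections import Counter, defaultdict

    corpus = "it was the best of times it was the worst of times".split()

    # Count how often each word follows each context word.
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        bigrams[prev][nxt] += 1

    def p_next(prev: str, nxt: str) -> float:
        # P(next | prev) by maximum likelihood: count(prev, next) / count(prev).
        total = sum(bigrams[prev].values())
        return bigrams[prev][nxt] / total if total else 0.0

    print(p_next("was", "the"))   # 1.0 - "was" is always followed by "the"
    print(p_next("the", "best"))  # 0.5 - "the" is followed by "best" or "worst"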
Recurrent Neural Network Model

hi - hidden state

xi - input

Wi - weights

yi - output
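A single vanilla-RNN time step in NumPy, using the notation above (toy dimensions; the production model uses LSTM cells):

    import numpy as np

    def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
        # New hidden state from the current input and the previous hidden state.
        h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
        # Output, e.g. logits over the vocabulary, from the hidden state.
        y_t = W_hy @ h_t + b_y
        return h_t, y_t

    # Toy sizes: 8-dim inputs, 16-dim hidden state, 10-word vocabulary.
    rng = np.random.default_rng(0)
    W_xh = rng.normal(size=(16, 8))
    W_hh = rng.normal(size=(16, 16))
    W_hy = rng.normal(size=(10, 16))
    b_h, b_y = np.zeros(16), np.zeros(10)

    h = np.zeros(16)
    for x in rng.normal(size=(5, 8)):   # a sequence of 5 input vectors
        h, y = rnn_step(x, h, W_xh, W_hh, W_hy, b_h, b_y)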
Sequence-to-Sequence model

An encoder network consumes the context (previous e-mail, subject, etc.) and a decoder network generates the suggestion one token at a time.
BoW (Bag of Words) Model

It was the best of times,
it was the worst of times,
it was the age of wisdom,
it was the age of foolishness

Vocabulary, in order of first appearance: [it, was, the, best, of, times, worst, age, wisdom, foolishness]

"it was the worst of times" = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0]
"it was the age of wisdom" = [1, 1, 1, 0, 1, 0, 0, 1, 1, 0]
"it was the age of foolishness" = [1, 1, 1, 0, 1, 0, 0, 1, 0, 1]
Model Used

BoW + RNN-LM: contextual fields are encoded as averaged word embeddings (bag of words) and fed to an RNN language model decoder.

● Marginally sacrificed prediction quality
● Much lower latency
Model Architecture
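A rough sketch of this architecture: context tokens are embedded and averaged, and at each decoding step the context vector is concatenated with the previous word's embedding and fed to the recurrent language model (all names and dimensions below are illustrative, and a plain tanh cell stands in for the LSTM):

    import numpy as np

    VOCAB, EMB, HID = 1000, 32, 64
    rng = np.random.default_rng(1)
    E = rng.normal(size=(VOCAB, EMB))        # word embedding table
    W_in = rng.normal(size=(HID, 2 * EMB))   # input projection
    W_hh = rng.normal(size=(HID, HID))       # recurrent weights
    W_out = rng.normal(size=(VOCAB, HID))    # output projection

    def encode_context(context_ids):
        # Bag-of-words encoding: average the context token embeddings.
        return E[context_ids].mean(axis=0)

    def decode_step(prev_word_id, context_vec, h):
        # Previous-word embedding concatenated with the averaged context.
        inp = np.concatenate([E[prev_word_id], context_vec])
        h = np.tanh(W_in @ inp + W_hh @ h)
        return h, W_out @ h                  # new state, vocabulary logits

    ctx = encode_context(np.array([5, 17, 42]))   # e.g. subject + previous mail
    h, logits = decode_step(7, ctx, np.zeros(HID))
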
Searching for the best possibility

● Greedy method - the straightforward approach

○ Pick the word with the maximum probability at each step.
○ It can often lead to sub-optimal output sequences.

● Beam Search - an optimization of best-first search

○ Less memory consumption.
○ At each level of the search tree, only promising neighbours are expanded.
○ β, the beam width, is the number of nodes visited/stored at each level.
○ A heap of the m best candidates is maintained.
○ A candidate sequence is considered complete when a sentence punctuation token or a
special end-of-sequence (<EOS>) token is generated, or when the candidate reaches a
predefined maximum output sequence length.
Beam Search Algorithm - Decoder

OPEN = {initial state}

while OPEN is not empty do

1. Remove the best node from OPEN, call it n.

2. If n is the goal state, backtrace the path to n (through recorded parents) and return the path.

3. Create n's successors.

4. Evaluate each successor, add it to OPEN, and record its parent.

5. If |OPEN| > β, keep the best β nodes (according to the heuristic) and remove the others from OPEN.

done
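The same procedure as runnable Python, where score_fn stands in for one time-step inference of the language model, returning (token, log-probability) pairs for candidate next words:

    import heapq

    def beam_search(score_fn, start, beam_width=3, max_len=10, eos="<EOS>"):
        # Each candidate is (negative log-probability, token sequence).
        beam = [(0.0, [start])]
        complete = []
        for _ in range(max_len):
            candidates = []
            for neg_logp, seq in beam:
                if seq[-1] == eos:          # candidate already finished
                    complete.append((neg_logp, seq))
                    continue
                # Expand this node's successors and score them.
                for token, logp in score_fn(seq):
                    candidates.append((neg_logp - logp, seq + [token]))
            if not candidates:
                break
            # Keep only the best beam_width nodes; discard the rest.
            beam = heapq.nsmallest(beam_width, candidates)
        complete.extend(beam)   # unfinished beams count at the length cutoff
        return min(complete)[1] if complete else []

In Smart Compose terms, score_fn(seq) would run one decoding step of the RNN-LM, and sentence punctuation tokens can be treated like <EOS> to mark a candidate complete.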
Train the model
Personalisation

● A small, simple language model for each Gmail user.

● Global and personal models are blended:

Pfinal = α Ppersonal + (1 − α) Pglobal

α - linear interpolation constant
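The blending is a simple linear interpolation of the two models' next-word distributions (a sketch; both models are assumed to return probabilities over the same vocabulary, and the value of α here is arbitrary):

    def blend(p_personal: dict, p_global: dict, alpha: float = 0.3) -> dict:
        # Pfinal(w) = alpha * Ppersonal(w) + (1 - alpha) * Pglobal(w)
        vocab = set(p_personal) | set(p_global)
        return {w: alpha * p_personal.get(w, 0.0)
                   + (1 - alpha) * p_global.get(w, 0.0)
                for w in vocab}

    p = blend({"regards": 0.6, "wishes": 0.4},
              {"regards": 0.2, "wishes": 0.8})
    # Pfinal("regards") = 0.3*0.6 + 0.7*0.2 = 0.32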


Conclusion

Each Smart Compose request is composed of a sequence of prefix-encoding steps and
beam-search steps. Each of these steps involves computing a single time-step
inference of the language model.

During the prefix-encoding phase, the goal is to obtain the final hidden states
of the language model, which represent the full encoding of the prefix. The
beam-search phase then explores hypothetical extensions of the prefix,
keeping track of plausible candidate suggestions.
References

1. Gmail Smart Compose: Real-Time Assisted Writing - https://arxiv.org/pdf/1906.00080.pdf
2. Google AI Blog - https://ai.googleblog.com/2018/05/smart-compose-using-neural-networks-to.html
3. Beam Search Strategy - https://hackernoon.com/beam-search-a-search-strategy-5d92fb7817f
