Dialog System: A Comprehensive Understanding
Mr. T
Perception
Dialog System
Trigger Word Detection
Sound wave → frequency domain → convolutional neural network → recurrent neural network → output
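As a hedged sketch of that stack (the mel-spectrogram shape and all layer sizes are invented for illustration, not taken from the slides), a per-frame trigger-word detector in PyTorch could look like:

```python
import torch
import torch.nn as nn

N_MELS, T = 40, 100  # mel bands x time frames (illustrative sizes)

class TriggerWordNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(N_MELS, 64, kernel_size=5, padding=2)
        self.rnn = nn.GRU(64, 64, batch_first=True)
        self.head = nn.Linear(64, 1)

    def forward(self, spec):                 # spec: (batch, N_MELS, T)
        x = torch.relu(self.conv(spec))      # local frequency patterns
        x, _ = self.rnn(x.transpose(1, 2))   # temporal context
        return torch.sigmoid(self.head(x))   # per-frame wake-word probability

net = TriggerWordNet()
print(net(torch.randn(1, N_MELS, T)).shape)  # torch.Size([1, 100, 1])
```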
What else…
Speech Recognition
Speech wave → pre-processing → acoustic features → decoder

[Figure: the decoder combines an Acoustic Model, an Acoustic Dictionary (pronunciation model), and an N-gram language model, e.g.:]

WORD   PRON (IPA)   N-GRAM SCORE
vợ     v ə ˨˩ˀ      2.5
quê    w e          0.7
https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf
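As a brief, hedged sketch of the pre-processing step (the file name is a placeholder, and MFCCs are just one common choice of acoustic feature), using librosa:

```python
import librosa

# Load a waveform and extract MFCC acoustic features.
# "speech.wav" is a placeholder path; 13 coefficients is a common choice.
y, sr = librosa.load("speech.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
print(mfcc.shape)  # (13, number_of_frames)
```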
Look up each word's recording and concatenate? No!
Pipeline for Text-to-Speech
Prosodic Analysis
• Prosodic structure
• Prosodic prominence
• Tune
→ Voice output
https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf
The State of the Art: Prediction-Based Vectors (Word Embeddings)
[Figure: the classic analogy king − man + woman ≈ queen in embedding space.]
One-Hot Encoding Vector

Corpus: cô gái, hot girl, xinh đẹp, trước đây, là, một, chàng trai, đam mỹ

Each word gets a 1×8 vector representation:

             1  2  3  4  5  6  7  8
cô gái       1  0  0  0  0  0  0  0
hot girl     0  1  0  0  0  0  0  0
xinh đẹp     0  0  1  0  0  0  0  0
trước đây    0  0  0  1  0  0  0  0
là           0  0  0  0  1  0  0  0
một          0  0  0  0  0  1  0  0
chàng trai   0  0  0  0  0  0  1  0
đam mỹ       0  0  0  0  0  0  0  1
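A minimal sketch of how these one-hot rows arise (plain NumPy; nothing here is specific to the slides):

```python
import numpy as np

vocab = ["cô gái", "hot girl", "xinh đẹp", "trước đây",
         "là", "một", "chàng trai", "đam mỹ"]

# Word i gets row i of the 8x8 identity matrix as its 1x8 vector.
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}
print(one_hot["cô gái"])  # [1. 0. 0. 0. 0. 0. 0. 0.]
```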
What’s wrong…
Custom Encoding Vector

Corpus: cô gái, hot girl, xinh đẹp, trước đây, là, một, chàng trai, đam mỹ

Each word gets a 1×5 vector over hand-picked features
(người = person, bản chất = attribute, thời gian = time, số đếm = number, nữ tính = femininity):

             người  bản chất  thời gian  số đếm  nữ tính
cô gái        1       0         0         0       1
hot girl      0.7     1         0         0       0.7
xinh đẹp      0.6     1         0         0       0.5
trước đây     0       0         1         1       0
là            0       0         0         0       0
một           0       0         0         1       0
chàng trai    1       0         0         0       0
đam mỹ        0.7     1         0         0       0
With the same custom features, related words also get similar vectors, a better relationship between words than one-hot encoding can express.
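A small sketch of why this helps: with the hand-crafted vectors copied from the table above, cosine similarity ranks related words above unrelated ones.

```python
import numpy as np

# Features: người, bản chất, thời gian, số đếm, nữ tính (from the table).
vecs = {
    "cô gái":     np.array([1.0, 0.0, 0.0, 0.0, 1.0]),
    "hot girl":   np.array([0.7, 1.0, 0.0, 0.0, 0.7]),
    "chàng trai": np.array([1.0, 0.0, 0.0, 0.0, 0.0]),
    "một":        np.array([0.0, 0.0, 0.0, 1.0, 0.0]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(vecs["cô gái"], vecs["hot girl"]))  # ≈ 0.70, related words
print(cosine(vecs["cô gái"], vecs["một"]))       # 0.0, unrelated words
```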
Count Vector
Let us understand this using a simple example.
• D1: He is a lazy boy. She is also lazy.
• D2: Neeraj is a lazy person.
Dictionary = [‘He’, ‘She’, ‘lazy’, ‘boy’, ‘Neeraj’, ‘person’]
D=2 (# docs), N=6 (# words in the dictionary)
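To make the D × N count matrix concrete, here is a minimal sketch in plain Python (punctuation is stripped, and the dictionary order is the one above):

```python
from collections import Counter

docs = {
    "D1": "He is a lazy boy. She is also lazy.",
    "D2": "Neeraj is a lazy person.",
}
dictionary = ["He", "She", "lazy", "boy", "Neeraj", "person"]

# D x N count matrix: one row per document, one column per dictionary word.
for name, text in docs.items():
    counts = Counter(text.replace(".", "").split())
    print(name, [counts[w] for w in dictionary])
# D1 [1, 1, 2, 1, 0, 0]
# D2 [0, 0, 1, 0, 1, 1]
```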
Now, let us compare the TF-IDF of a common word ‘This’ and a word ‘Messi’, which seems to be of relevance to Document 1.
TF(This, Document1) = 1/8; TF(This, Document2) = 1/5.
‘This’ appears in both documents, so IDF(This) = log(2/2) = 0, and TF-IDF(This, Document1) = (1/8) * 0 = 0.
TF-IDF penalizes common words but assigns greater weight to ‘Messi’.
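A worked sketch of those numbers (the ‘Messi’ count, 4 occurrences among Document 1's 8 words, comes from the cited blog post, not from the slide itself):

```python
import math

def tf(count, total_words):
    return count / total_words

def idf(n_docs, docs_containing_term):
    return math.log10(n_docs / docs_containing_term)

# "This" occurs in both documents, so its IDF is log(2/2) = 0.
print(tf(1, 8) * idf(2, 2))  # TF-IDF(This, Document1) = 0.0
print(tf(1, 5) * idf(2, 2))  # TF-IDF(This, Document2) = 0.0

# "Messi" occurs only in Document 1, so it keeps a nonzero weight.
print(tf(4, 8) * idf(2, 1))  # ≈ 0.15
```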
The big idea: similar words tend to occur together and have similar contexts. For example:
“Apple is a fruit. Mango is a fruit.”
Apple and mango tend to have a similar context, i.e., fruit.
CBOW predicts P(word | context); Skip-gram predicts P(context | word).
https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
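A minimal gensim sketch of both objectives on the toy fruit corpus (all hyperparameters are illustrative; sg=1 selects Skip-gram, sg=0 CBOW):

```python
from gensim.models import Word2Vec

sentences = [["apple", "is", "a", "fruit"],
             ["mango", "is", "a", "fruit"]]

# sg=1 trains Skip-gram, P(context | word); sg=0 trains CBOW.
model = Word2Vec(sentences, vector_size=50, window=2,
                 min_count=1, sg=1, epochs=50)

# Apple and mango share the "fruit" context, so their vectors drift closer.
print(model.wv.similarity("apple", "mango"))
```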
Intent and Entities
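As a hedged illustration (the utterance, intent name, and slot names are invented for this sketch), an NLU module maps each user message to one intent plus extracted entities:

```python
# Hypothetical NLU output for a single user utterance.
utterance = "Book a taxi to the airport at 7am"

nlu_result = {
    "intent": "book_taxi",             # what the user wants to do
    "entities": {                      # slot values found in the text
        "destination": "the airport",
        "time": "7am",
    },
}
```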
Statefulness is the key
• Follow-up questions
• Pending actions
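A minimal state-tracking sketch (the slot names reuse the hypothetical taxi example above, and the required-slot logic is an assumption, not from the slides):

```python
REQUIRED_SLOTS = {"destination", "time"}

state = {"intent": None, "slots": {}, "pending_action": None}

def update_state(state, nlu_result):
    # Remember the active intent and accumulate slots across turns,
    # so a follow-up like "at 7am" can complete an earlier request.
    if nlu_result["intent"]:
        state["intent"] = nlu_result["intent"]
    state["slots"].update(nlu_result["entities"])
    missing = REQUIRED_SLOTS - state["slots"].keys()
    # Park the action and ask a follow-up until every slot is filled.
    state["pending_action"] = f"ask_{missing.pop()}" if missing else "execute"
    return state

state = update_state(state, {"intent": "book_taxi",
                             "entities": {"destination": "the airport"}})
print(state["pending_action"])  # ask_time
state = update_state(state, {"intent": None, "entities": {"time": "7am"}})
print(state["pending_action"])  # execute
```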
Natural Language Generation
• Fixed responses + slot filling + a random pick from a pool
Not recommended
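For reference, a minimal sketch of that fixed-response + slot-filling approach (templates and slot names are invented for illustration):

```python
import random

# Each template carries the same slots; one is picked at random so
# the bot does not repeat itself word for word.
TEMPLATES = [
    "OK, a taxi to {destination} at {time}.",
    "Got it! Booking your ride to {destination} for {time}.",
]

def generate(slots):
    return random.choice(TEMPLATES).format(**slots)

print(generate({"destination": "the airport", "time": "7am"}))
```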
The Future: End-to-End
Data-driven approaches:
• Seq2Seq
• Reinforcement learning
https://aclweb.org/anthology/C18-3006
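A toy PyTorch encoder-decoder to make the Seq2Seq idea concrete (sizes and architecture are illustrative, not the cited paper's model):

```python
import torch
import torch.nn as nn

VOCAB, EMB, HID = 1000, 64, 128

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.encoder = nn.GRU(EMB, HID, batch_first=True)
        self.decoder = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, src, tgt):
        _, h = self.encoder(self.emb(src))       # encode the user turn
        dec, _ = self.decoder(self.emb(tgt), h)  # teacher-forced decode
        return self.out(dec)                     # logits over the vocab

model = Seq2Seq()
src = torch.randint(0, VOCAB, (2, 10))  # two fake user turns
tgt = torch.randint(0, VOCAB, (2, 8))   # two fake responses
print(model(src, tgt).shape)            # torch.Size([2, 8, 1000])
```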
Tips
• Be a script writer
• Give the bot a personality
• Control the dialogue
• APIs save time
• Label intents and entities
• Design the flow
• Keep it expandable
• Do lots of testing
Applications
[Architecture diagram: speech and text input pass through Google's speech-to-text (REST API); the virtual assistant analyzes each request and hands it to dialog management, backed by logical functions, a database, and a data warehouse. The assistant connects HR staff and employees to the company's knowledge resources, supporting communication, monitoring, and training.]
THANK YOU!