
Recent Advances in Natural Language Processing
Seth Grimes
Alta Plana Corporation
@SethGrimes – grimes@altaplana.com

November 16, 2021


2019 & 2020
tedcomd.com
meetup.com/NY-NLP
Disclaimer
I use A LOT of commercial product materials in the
slides that follow. These are illustrations and not
recommendations, and I have no financial interest in
the companies (unless disclosed).
Natural Language Processing
Natural Language Understanding (NLU)
• OCR, language detection, tokenization, parsing
• Information extraction: parts of speech, chunks, entities,
aspects, topics/themes, relations, attributes, events, intent …
• Speech processing: verbal and nonverbal
Natural Language Generation (NLG)
NLU + NLG together, for example:
• Summarization
• Machine translation
• Conversational interfaces
• Question answering
Functions

https://gradientflow.com/2020nlpsurvey/
Empirical Methods in Natural Language Processing (EMNLP2020)
Explore EMNLP21
Early Days (1958)
Transcribing
Encoding
Abstracting

Who needs to know?


Who knows what?
What is known?

Hans Peter Luhn


“A Business Intelligence System”
IBM Journal, October 1958
“Statistical information derived from word frequency and distribution is
used by the machine to compute a relative measure of significance, first for
individual words and then for sentences.”
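A minimal sketch of the idea in the quotation above, in Python: score words by frequency, then score sentences from their words. This is a simplification for illustration, not Luhn's exact measure. The stopword list, the function names, and the sample text are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword list chosen for this illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "is", "in", "on", "for", "by", "it", "are"}

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def word_significance(text):
    """Score each non-stopword by its raw frequency in the text."""
    return Counter(w for w in tokenize(text) if w not in STOPWORDS)

def sentence_scores(text):
    """Score each sentence as the mean significance of its words."""
    significance = word_significance(text)
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    scored = []
    for sentence in sentences:
        words = tokenize(sentence)
        # Stopwords are absent from the Counter, so they contribute 0.
        total = sum(significance[w] for w in words)
        scored.append((total / len(words) if words else 0.0, sentence))
    return scored

# Made-up sample text, for illustration only.
text = (
    "Machines can index documents. "
    "Word frequency helps machines rank documents. "
    "The weather is nice."
)
best_score, best_sentence = max(sentence_scores(text))
print(best_sentence)
```

Taking the highest-scoring sentences gives a crude extractive abstract. Averaging over sentence length is one possible normalization, chosen here so that long sentences do not win by default.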
“All models are wrong, but some are useful.”
-- George Box
+17 years

https://en.wikipedia.org/wiki/Document-term_matrix
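A document-term matrix has one row per document and one column per vocabulary term, with each cell holding a count. A small sketch in Python, assuming whitespace tokenization and raw counts (real systems usually add normalization and weighting such as TF-IDF). The sample documents and the function name are invented for illustration.

```python
from collections import Counter

# Made-up sample documents, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

# Whitespace tokenization; the sorted vocabulary gives a stable column order.
tokenized = [d.split() for d in docs]
vocab = sorted({tok for doc in tokenized for tok in doc})

def document_term_matrix(tokenized_docs, vocabulary):
    """Return one row of term counts per document, with columns in vocabulary order."""
    rows = []
    for doc in tokenized_docs:
        counts = Counter(doc)  # missing terms count as 0
        rows.append([counts[term] for term in vocabulary])
    return rows

dtm = document_term_matrix(tokenized, vocab)

print(vocab)
for row in dtm:
    print(row)
```

Note that "cat" and "cats" land in separate columns: without stemming or lemmatization, the matrix treats surface forms as unrelated terms.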
Skipping Over a Lot of Stuff…
Rules
Taxonomies & ontologies
Booleans
Statistical models, especially cooccurrence
Sequence models: RNNs & LSTM

Word2Vec (2013)

https://code.google.com/p/word2vec/

“You shall know a word by the company it keeps.”
– J.R. Firth, 1957
https://developers.google.com/machine-learning/crash-course/embeddings/translating-to-a-lower-dimensional-space
Word2Vec: Key Concepts
Continuous bag-of-words (CBOW) predicts a word from a window of surrounding words.

Skip-gram uses a word to predict a window of surrounding words.
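The difference between the two objectives shows up in how training examples are built from a token sequence. A sketch in Python of that data-preparation step only; no model is trained here, and the function names and window size are assumptions chosen for illustration.

```python
def context_window(tokens, i, window):
    """Return the tokens within `window` positions of index i, excluding tokens[i]."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right

def cbow_examples(tokens, window=2):
    """CBOW: the surrounding words (input) predict the center word (target)."""
    return [(context_window(tokens, i, window), tokens[i]) for i in range(len(tokens))]

def skipgram_examples(tokens, window=2):
    """Skip-gram: the center word (input) predicts each surrounding word (target)."""
    return [
        (tokens[i], ctx)
        for i in range(len(tokens))
        for ctx in context_window(tokens, i, window)
    ]

tokens = "you shall know a word".split()
cbow = cbow_examples(tokens, window=1)
skip = skipgram_examples(tokens, window=1)

for context, target in cbow:
    print("CBOW     ", context, "->", target)
for center, target in skip:
    print("skip-gram", center, "->", target)
```

CBOW yields one example per position with the whole context as input, while skip-gram yields one (word, neighbor) pair per context word, so it produces more, smaller examples from the same text.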
Doc2Vec (2014)

https://arxiv.org/abs/1405.4053
Sense2Vec (2015)

“Sense2vec (Trask et. al, 2015) is a new twist on word2vec that lets you learn
more interesting, detailed and context-sensitive word vectors.”

https://arxiv.org/abs/1511.06388
Encoder-Decoder Architecture

Here, machine translation:

https://leonoverweel.com/projects/2019/nlu-coursework/
2020:
Transformers (2017)

https://arxiv.org/pdf/1910.03771.pdf

https://arxiv.org/abs/1706.03762
BERT (2018)

https://arxiv.org/pdf/1910.03771.pdf

https://arxiv.org/abs/1810.04805
Transfer Learning

https://pennylane.ai/qml/demos/tutorial_quantum_transfer_learning.html
https://pair-code.github.io/lit/
Back To The Garden
NLP Libraries
https://blog.rasa.com/rasa-nlu-in-depth-part-1-intent-classification/
Hugging Face
Model Hub
Hugging Face Pipeline Example
Hugging Face Pipeline Examples
Cloud Services
Amazon Comprehend Medical

“With a simple API call to Amazon Comprehend Medical you can quickly and accurately
extract information such as medical conditions, medications, dosages, tests, treatments and
procedures, and protected health information while retaining the context of the information.
Amazon Comprehend Medical can identify the relationships among the extracted
information to help you build applications for use cases like population health analytics,
clinical trial management, pharmacovigilance, and summarization. You can also use Amazon
Comprehend Medical to link the extracted information to medical ontologies...”
https://aws.amazon.com/comprehend/medical/
AWS Comprehend:
Ontology Linking

https://aws.amazon.com/blogs/aws/new-amazon-comprehend-medical-adds-ontology-linking/
Services and Solutions: Examples
https://www.qualtrics.com/experience-management/research/text-analysis/
Conversation / Analytics
(2018)

https://blog.rasa.com/conversational-ai-your-guide-to-five-levels-of-ai-assistants-in-enterprise/
Voice conversation analytics
Keep Up With NLP Developments
https://www.language-technology.com/twin
https://newsletter.ruder.io/