Natural Language Processing
In practice, NLU (natural language understanding) is often used to mean NLP: the understanding by computers of the structure and meaning of human languages, allowing developers and users to interact with computers using natural sentences and communication. Computational linguistics (CL) is the scientific field that studies computational aspects of human language, while NLP is the engineering discipline concerned with building computational artifacts that understand, generate, or manipulate human language.
Research on NLP began shortly after the invention of digital computers in the 1950s, and NLP
draws on both linguistics and AI. However, the major breakthroughs of the past few years have
been powered by machine learning, which is a branch of AI that develops systems that learn and
generalize from data. Deep learning is a kind of machine learning that can learn very complex
patterns from large datasets, which means that it is ideally suited to learning the complexities of
natural language from datasets sourced from the web.
NLP is now being applied in many industries:
Healthcare: As healthcare systems all over the world move to electronic medical records, they are encountering large amounts of unstructured data. NLP can be used to analyze health records and gain new insights from them.
Legal: To prepare for a case, lawyers must often spend hours examining large collections of documents, searching for relevant material. NLP technology can automate the process of legal discovery, sifting through large volumes of documents and cutting down on both time and human error.
Finance: The financial world moves extremely fast, and any competitive advantage is important. In the financial field, traders use NLP technology to automatically mine corporate documents and news releases for information relevant to their portfolios and trading decisions.
Customer service: Many large companies are using virtual assistants or chatbots to
help answer basic customer inquiries and information requests (such as FAQs),
passing on complex questions to humans when necessary.
Insurance: Large insurance companies are using NLP to sift through documents and
reports related to claims, in an effort to streamline the way business gets done.
Another kind of model is used to recognize and classify entities in documents. For each word in
a document, the model predicts whether that word is part of an entity mention, and if so, what
kind of entity is involved. For example, in “XYZ Corp shares traded for $28 yesterday”, “XYZ
Corp” is a company entity, “$28” is a currency amount, and “yesterday” is a date. The training
data for entity recognition is a collection of texts, where each word is labeled with the kinds of
entities the word refers to. This kind of model, which produces a label for each word in the input,
is called a sequence labeling model.
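To make the labeling scheme concrete, here is a minimal sketch in Python. The BIO-style tags and the use of the Hugging Face transformers pipeline are common practice but are assumptions of this sketch, not part of the text above; the pretrained model the pipeline downloads by default, and therefore its exact output, may vary.

```python
from transformers import pipeline

# Training data for sequence labeling is commonly encoded with BIO tags:
# B- marks the first word of an entity mention, I- a continuation, O no entity.
words = ["XYZ", "Corp", "shares", "traded", "for", "$28", "yesterday"]
tags = ["B-ORG", "I-ORG", "O", "O", "O", "B-MONEY", "B-DATE"]

# A pretrained sequence labeling model predicts one label per word;
# aggregation_strategy="simple" merges word pieces back into whole mentions.
ner = pipeline("token-classification", aggregation_strategy="simple")
for ent in ner("XYZ Corp shares traded for $28 yesterday"):
    print(ent["entity_group"], ent["word"], round(ent["score"], 3))
```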
Sequence-to-sequence models are a more recent addition to the family of models used in NLP. A sequence-to-sequence (or seq2seq) model takes an entire sentence or document as input (as a document classifier does), but it produces a sentence or some other sequence (for example, a computer program) as output, whereas a document classifier produces only a single symbol. Example applications of seq2seq models include machine translation, which, for example, takes an English sentence as input and returns its French translation as output; document summarization (where the output is a summary of the input); and semantic parsing (where the input is a query or request in English, and the output is a computer program implementing that request).
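As an illustration, the sketch below runs two seq2seq tasks through the Hugging Face transformers pipeline API. The task names are real pipeline tasks, but the default models they download, and their exact outputs, are assumptions of this sketch.

```python
from transformers import pipeline

# Machine translation: the input sequence is an English sentence,
# the output sequence is its French translation.
translator = pipeline("translation_en_to_fr")
print(translator("Deep learning models can learn language from web data."))

# Summarization: the output sequence is a shorter version of the input.
summarizer = pipeline("summarization")
article = ("NLP draws on both linguistics and AI. The major breakthroughs "
           "of the past few years have been powered by machine learning, "
           "which develops systems that learn and generalize from data.")
print(summarizer(article, max_length=30, min_length=10))
```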
Deep learning, pretrained models, and transfer learning: Deep learning is the most widely used kind of machine learning in NLP. In the 1980s, researchers developed neural networks, in which a large number of primitive machine learning models are combined into a single network; by analogy with brains, these simple machine learning models are sometimes called "neurons." The neurons are arranged in layers, and a deep neural network is one with many layers. Deep learning is machine learning using deep neural network models.
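To make the idea of layers concrete, here is a minimal sketch of a small deep network in PyTorch; the layer sizes and the two-class output are arbitrary assumptions chosen only to show simple units arranged in stacked layers.

```python
import torch
import torch.nn as nn

# Each nn.Linear layer is a bank of simple units ("neurons");
# stacking several layers yields a *deep* neural network.
model = nn.Sequential(
    nn.Linear(300, 128),  # input layer: e.g., a 300-dim text embedding
    nn.ReLU(),
    nn.Linear(128, 64),   # hidden layer
    nn.ReLU(),
    nn.Linear(64, 2),     # output layer: e.g., scores for two classes
)

x = torch.randn(1, 300)   # one example with 300 input features
print(model(x).shape)     # -> torch.Size([1, 2])
```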
Because of their complexity, deep neural networks generally require a lot of data to train, and processing that data takes a lot of compute power and time. Modern deep neural network NLP models are trained from a diverse array of sources, such as all of Wikipedia and data scraped from the web. The training data might be on the order of 10 GB or more in size, and training the deep neural network might take a week or more on a high-performance cluster. (Researchers find that even deeper models trained from even larger datasets achieve even higher performance, so currently there is a race to train bigger and bigger models from larger and larger datasets.)
The voracious data and compute requirements of deep neural networks would seem to severely limit their usefulness. However, transfer learning enables a trained deep neural network to be further trained to achieve a new task with much less training data and compute effort. The simplest kind of transfer learning is called fine-tuning: the model is first trained on a large generic dataset (for example, Wikipedia) and then further trained ("fine-tuned") on a much smaller task-specific dataset that is labeled with the actual target task. Perhaps surprisingly, the fine-tuning datasets can be extremely small, containing perhaps only hundreds or even tens of training examples, and fine-tuning requires only minutes on a single CPU. Transfer learning makes it easy to deploy deep learning models throughout the enterprise.
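A minimal fine-tuning sketch, assuming the Hugging Face transformers library and a generic pretrained checkpoint: the checkpoint name, the tiny two-example sentiment dataset, and the hyperparameters are all illustrative assumptions, not a prescribed recipe.

```python
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Start from a model already pretrained on a large generic corpus...
checkpoint = "bert-base-uncased"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# ...then fine-tune on a tiny task-specific labeled dataset.
texts = ["The product works great", "Support never answered my call"]
labels = [1, 0]  # 1 = positive, 0 = negative (hypothetical target task)
enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")

class TinyDataset(torch.utils.data.Dataset):
    def __len__(self):
        return len(labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in enc.items()}
        item["labels"] = torch.tensor(labels[i])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=3),
    train_dataset=TinyDataset(),
)
trainer.train()  # processes only the small dataset: minutes, not weeks
```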