
UNIT V NATURAL LANGUAGE PROCESSING

Language models - Phrase Structure Grammars - Syntactic Analysis - Augmented Grammars
and Semantic Interpretation - Application with NLP: Developing a Simple Chatbot - Types of
Chatbot.
5.1 LANGUAGE MODELS
A LANGUAGE can be defined as a set of strings; “print(2 + 2)” is a legal program in the language
Python, whereas “2)+(2 print” is not.
Language is specified by a set of rules called a grammar.
Formal languages also have rules that define the meaning or semantics of a program.
Example: the “meaning” of “2+2” is 4, and the meaning of “1/0” is that an error is signaled.
Natural languages are also ambiguous.
Example: “The man saw the girl with the telescope”
Natural languages are difficult to deal with because they are very large and constantly
changing.
N-gram character models:
• A written text is composed of characters: letters, digits, punctuation, and spaces (in
English). One of the simplest language models is a probability distribution over sequences
of characters.
We write P(c1:N) for the probability of a sequence of N characters, c1 through cN. In one Web
collection, P(“the”) = 0.027 and P(“zgq”) = 0.000000002.
A sequence of written symbols of length n is called an n-gram. A model of the probability
distribution of n-letter sequences is thus called an n-gram model.
P(w|h) is the probability of a word w given some history h. Suppose the history h is “its water
is so transparent that” and we want to know the probability that the next word is “the”. One way
to estimate this probability is from relative frequency counts in a large corpus:
P(the | its water is so transparent that) = Count(its water is so transparent that the) / Count(its water is so transparent that)
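To make this counting idea concrete, here is a minimal sketch (my own illustration, not part of the original notes) that estimates bigram probabilities P(w | h) by relative frequency counts over a tiny toy corpus; the corpus and the bigram_prob helper are illustrative assumptions.

from collections import Counter

# Toy corpus; in practice this would be a large collection of text.
corpus = "its water is so transparent that the water is so clear".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(word, prev):
    # Estimate P(word | prev) = Count(prev word) / Count(prev).
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

print(bigram_prob("water", "its"))   # 1.0: "its" is always followed by "water" in this corpus
print(bigram_prob("clear", "so"))    # 0.5: "so" is followed by "transparent" once and "clear" once

The same counting scheme extends to longer histories (trigrams and beyond) and to sequences of characters instead of words.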
5.2 PHRASE STRUCTURE GRAMMARS
A Grammar is a collection of rules that defines a language as a set of allowable strings of
words.
Rules for allowable characters, words and sentences.
Lexical Category (also known as a part of speech) such as noun or adjective (category of
words) [ words / lexicon ]
String together lexical categories to form Syntactic Categories such as noun phrase or verb
phrase [ how to construct a sentence by using words ]
Combine these syntactic categories into trees representing the Phrase Structure of sentences:
nested phrases, each marked with a category
Grammars are represented using V,T,P,S
V → Non-Terminal
T → Terminal
P → Production Rules
S → Start Symbol
FOUR CLASSES OF GRAMMAR FORMALISM
Grammatical formalisms can be classified by their generative capacity: the set of languages
they can represent.
• Recursively Enumerable – Turing Machine
No Restriction
Both sides may contain any number of non-terminals and terminals
Example: ABC → DE
• Context Sensitive Grammar – Linear Bounded Automaton
The right-hand side must contain at least as many symbols as the left-hand side
AXB→AYB
• Context Free Grammar – PDAs
The left-hand side must be a single non-terminal, which can be rewritten regardless of context;
the right-hand side may be any string of terminals and non-terminals
Most popular formalism for natural language and programming language grammars
Example: A → aAb [0.90] | ε [0.10]
Language: some number of a's followed by the same number of b's
V – {A}
T – {a,b}
P – {A→aAb,A→}
S – {A}
Example derivation for w = aaabbb:
A ⇒ aAb ⇒ aaAbb ⇒ aaaAbbb ⇒ aaabbb
For w = aaaabbbb:
A ⇒ aAb
⇒ aaAbb
⇒ aaaAbbb
⇒ aaaaAbbbb
⇒ aaaabbbb
For w = aabb:
A ⇒ aAb ⇒ aaAbb ⇒ aabb

• Regular Grammar – Finite State Machine


Single non-terminal on the left hand side and a terminal symbol optionally followed by a
nonterminal on the right hand side
Example: A → aB | b
PROBABILISTIC CONTEXT FREE GRAMMAR:
• “probabilistic” means that the grammar assigns a probability to every string
VP → Verb [0.70] | VP NP [0.30]
• Here VP (verb phrase) and NP (noun phrase) are non-terminal symbols. The grammar
also refers to actual words, which are called terminal symbols.
• This rule is saying that with probability 0.70 a verb phrase consists solely of a verb, and
with probability 0.30 it is a VP followed by an NP.
THE LEXICON OF 0
List of allowable words
The words are grouped into the lexical categories familiar to dictionar y user s
nouns, pronouns, and names to denote things
verbs to denote events
adjecti ves to modify nouns
adverbs to modify verbs
function words: ar ticles (such as the), prepositions (in), and conjunctions (and).
Example:
LEXICON FOR THE LANGUAGE 0
Noun → stench [0.05] | breeze [0.10] | wumpus [0.15] | pits [0.05] | ...
Verb → is [0.10] | feel [0.10] | smells [0.10] | stinks [0.05] | ...
Adjective → right [0.10] | dead [0.05] | smelly [0.02] | breezy [0.02] ...
Adverb → here [0.05] | ahead [0.05] | near by [0.02] | ...
Pronoun → me [0.10] | you [0.03] | I [0.10] | it [0.10] | ...
RelPro → that [0.40] | which [0.15] | who [0.20] | whom [0.02] ∨ ...
Name → John [0.01] | Mar y [0.01] | Boston [0.01] | ...
Ar ticle → the [0.40] | a [0.30] | an [0.10] | ever y [0.05] | ...
Prep → to [0.20] | in [0.10] | on [0.05] | near [0.10] | ...
Conj → and [0.50] | or [0.10] | but [0.20] | yet [0.02] ∨ ...
Digit → 0 [0.20] | 1 [0.20] | 2 [0.20] | 3 [0.20] | 4 [0.20] | ...
Each of the categories ends in ... to indicate that there are other words in the category
OPEN CLASSES: nouns, names, verbs, adjectives, and adverbs
CLOSED CLASSES: pronoun, relative pronoun, article, preposition, and conjunction
THE LEXICON OF 0
Grammar is required to combine the words into phrases .
The syntactic categories are sentence (S), noun phrase (NP), verb phrase (VP), list of
adjectives (Adjs), prepositional phrase (PP), and relative clause (RelClause).
E0: S → NP VP [0.90] I + feel a breeze |
S Conj S [0.10] I feel a breeze + and + It stinks
NP → Pronoun [0.30] I |
Name [0.10] John |
Noun [0.10] pits |
Article Noun [0.25] the + wumpus |
Article Adjs Noun [0.05] the + smelly dead + wumpus |
Digit Digit [0.05] 3 4 |
NP PP [0.10] the wumpus + in 1 3 |
NP RelClause [0.05] the wumpus + that is smelly
VP → Verb [0.40] stinks |
VP NP [0.35] feel + a breeze |
VP Adjective [0.05] smells + dead |
VP PP [0.10] is + in 1 3 |
VP Adverb [0.10] go + ahead
Adjs → Adjective [0.80] smelly |
Adjective Adjs [0.20] smelly + dead
PP → Prep NP [1.00] to + the east
RelClause → RelPro VP [1.00] that + is smelly
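To see how these rule probabilities are used, here is a minimal sketch (an illustration of mine, not from the textbook) that generates random sentences from a small fragment of the grammar and lexicon above by repeatedly expanding non-terminals according to rule probabilities. The fragment is simplified and renormalized so that each category's rules sum to 1.

import random

# A small, simplified fragment of the E0 grammar and lexicon above.
GRAMMAR = {
    "S":       [(["NP", "VP"], 0.90), (["S", "Conj", "S"], 0.10)],
    "NP":      [(["Pronoun"], 0.40), (["Article", "Noun"], 0.60)],
    "VP":      [(["Verb"], 0.60), (["VP", "NP"], 0.40)],
    "Pronoun": [(["I"], 0.5), (["it"], 0.5)],
    "Article": [(["the"], 0.6), (["a"], 0.4)],
    "Noun":    [(["wumpus"], 0.5), (["breeze"], 0.5)],
    "Verb":    [(["stinks"], 0.5), (["feel"], 0.5)],
    "Conj":    [(["and"], 1.0)],
}

def generate(symbol="S"):
    # Terminal words are not keys of GRAMMAR, so return them unchanged.
    if symbol not in GRAMMAR:
        return [symbol]
    rules, probs = zip(*GRAMMAR[symbol])
    rhs = random.choices(rules, weights=probs)[0]   # sample one right-hand side
    words = []
    for sym in rhs:
        words.extend(generate(sym))
    return words

print(" ".join(generate()))   # e.g. "the wumpus stinks" or "I feel a breeze"

Sampling in this way produces likelier sentences more often; the same tables can also be used in the other direction, to score a given sentence by multiplying the probabilities of the rules in its parse tree, as the next section shows.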
PARSE TREE:
• The parse tree gives a constructive proof that the string of words is indeed a sentence
according to the rules of E0.
Example: Every wumpus smells

• Each interior node of the tree is labeled with its probability.


• The probability of the tree as a whole is 0.9 × 0.25 × 0.05 × 0.15 × 0.40 × 0.10 =
0.0000675.
• Since this tree is the only parse of the sentence, that number is also the probability of
the sentence.
• The tree can also be written in linear form as
[S [NP [Article every] [Noun wumpus]] [VP [Verb smells]]].
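As a quick check of that arithmetic, a short sketch (mine, not the textbook's) multiplying the six rule probabilities used in the parse:

# Rule probabilities taken from the E0 grammar and lexicon above.
p = 1.0
for rule_prob in [0.90,   # S -> NP VP
                  0.25,   # NP -> Article Noun
                  0.05,   # Article -> every
                  0.15,   # Noun -> wumpus
                  0.40,   # VP -> Verb
                  0.10]:  # Verb -> smells
    p *= rule_prob
print(p)   # about 6.75e-05, i.e. 0.0000675 (up to floating-point rounding)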
The grammar OVERGENERATES: that is, it generates sentences that are not grammatical.
It also UNDERGENERATES: there are many sentences of English that it rejects.
5.3 SYNTACTIC ANALYSIS (PARSING)
Parsing is the process of analyzing a string of words to uncover its phrase structure,
according to the rules of a grammar.
Example: Trace of the process of finding a parse for the string “The wumpus is dead”
TOP-DOWN PARSING: Parsing starts from the start symbol and transforms it into the
input string.
Leftmost Derivation: the leftmost non-terminal is replaced first
S
⇒ NP VP
⇒ Article Noun VP
⇒ The Noun VP
⇒ The wumpus VP
⇒ The wumpus VP Adjective
⇒ The wumpus Verb Adjective
⇒ The wumpus is Adjective
⇒ The wumpus is dead
Rightmost Derivation: the rightmost non-terminal is replaced first
S
⇒ NP VP
⇒ NP VP Adjective
⇒ NP VP dead
⇒ NP Verb dead
⇒ NP is dead
⇒ Article Noun is dead
⇒ Article wumpus is dead
⇒ The wumpus is dead

Bottom-up Parsing: parsing starts with the input string and constructs the parse tree up
to the start symbol.

The Wumpus is dead


Article Wumpus is dead
Article Noun is dead
NP is dead
NP Verb dead
NP VP dead
NP VP Adjective
NP VP
S

• Both top-down and bottom-up parsing can be inefficient, however, because they can
end up repeating effort in areas of the search space that lead to dead ends. Consider
the following two sentences:
Have the students in section 2 of Computer Science 101 take the exam.
Have the students in section 2 of Computer Science 101 taken the exam?
• Even though they share the first 10 words, these sentences have very different parses,
because the first is a command and the second is a question.
• A left-to-right parsing algorithm would have to guess whether the first word is part of
a command or a question and will not be able to tell if the guess is correct until at
least the eleventh word, take or taken.
• If the algorithm guesses wrong, it will have to backtrack all the way to the first word
and reanalyze the whole sentence under the other interpretation.
DYNAMIC PROGRAMMING:
Every time we analyze a substring, store the results so we won't have to reanalyze it later.
We can record that result in a data structure known as a chart. Algorithms that do this are
called chart parsers.
There are many types of chart parsers.
Example: CYK algorithm (John Cocke, Daniel Younger, and Tadao Kasami)
Chomsky Normal Form:
NT → NT NT
NT → T
Deciding Membership Using CYK Algorithm
To decide the membership of any given string, we construct a triangular table where -
• Each row of the table corresponds to one particular length of substrings.
• The bottom-most row corresponds to substrings of length 1.
• The second row from the bottom corresponds to substrings of length 2.
• The third row from the bottom corresponds to substrings of length 3.
• The top-most row corresponds to the given string of length n.
Notations:
Xij
Xij represents the substring of x starting from location i and having length j.
Example: x = abcd, n = 4
Number of substrings possible = n(n+1)/2 = 4 × (4+1)/2 = 10
a b c d
1 2 3 4
• X11 = a, X21 = b, X31 = c, X41 = d
• X12 = ab, X22 = bc, X32 = cd
• X13 = abc, X23 = bcd
• X14 = abcd
Vij
Vij represents the set of variables in the grammar which can derive the substring Xij.
If this set contains the start symbol S, then -
• Substring Xij can be derived from the given grammar.
• Substring Xij is a member of the language of the given grammar.
Example Problem:
Consider the grammar given below and check the acceptance of the string w = baaba using the
CYK algorithm:
S → AB / BC
A → BA / a
B → CC / b
C → AB / a
Solution:
The given grammar is already in Chomsky Normal Form, so no conversion is needed.
5 V15
4 V14 V24
3 V13 V23 V33
2 V12 V22 V32 V42
1 V11 V21 V31 V41 V51
1 2 3 4 5

Row 1:
V11 represents the set of variables deriving X11
X11=b
V11={B}
V21 represents the set of variables deriving X21
X21=a
V21={A,C}
V31 represents the set of variables deriving X31
X31=a
V31={A,C}
V41 represents the set of variables deriving X41
X41=b
V41={B}
V51 represents the set of variables deriving X51
X51=a
V51={A,C}
Row 2:
As per the algorithm, to find the value of Vij from the 2nd row onwards,
we use the formula -

Vij = ∪ ( Vik . V(i+k)(j-k) )
where k varies from 1 to j-1
V12
i=1,j=2,k=1
V12=V11.V21
V12={B}{A,C}
V12={BA,BC}
V12={A,S}
V22
i=2,j=2,k=1
V22=V21.V31
V22={A,C}{A,C}
V22={AA,AC,CA,CC}
Since AA , AC and CA do not exist, so we have -
V22={B}
V32
i=3,j=2,k=1
V32=V31.V41
V32={A,C}{B}
V32={AB,CB}
V32={AB}
V32={S,C}
V42
i=4,j=2,k=1
V42=V41.V51
V42={B}{A,C}
V42={BA,BC}
V42={A,S}

Row 3:
V13
i=1,j=3,k=1,2
V13=V11.V22 U V12.V31
V13= { B } { B } ∪ { A , S } { A , C }
V13 = { BB } ∪ { AA , AC , SA , SC }
V13= ϕ
V23
i=2,j=3,k=1,2
V23=V21.V32 U V22.V41
V23= { A , C } { S , C } ∪ { B } { B }
V23= { AS , AC , CS , CC } ∪ { BB }
Since AS , AC , CS and BB do not exist, only CC remains, and CC is derived by B -
V23 = { B }
V33
i=3,j=3,k=1,2
V33=V31.V42 U V32.V51
V33= { A , C } { A , S } ∪ { S , C } { A , C }
V33= { AA , AS , CA , CS } ∪ { SA , SC , CA , CC }
V33= ϕ ∪ { CC }
V33= ϕ ∪ { B }
V33 = { B }

Row 4:
V14
i=1,j=4,k=1,2,3
V14=V11.V23 U V12.V32 U V13.V41
V14= { B } { B } ∪ { A , S } { S , C } ∪ ϕ { B }
V14= { BB } ∪ { AS , AC , SS , SC } ∪ ϕ
Since BB , AS , AC , SS and SC do not exist, so we have -
V14= ϕ ∪ ϕ ∪ ϕ
V14 = ϕ
V24
i=2,j=4,k=1,2,3
V24= V21.V33 ∪ V22.V42 ∪ V23.V51
V24= { A , C } { B } ∪ { B } { A , S } ∪ { B } { A , C }
V24= { AB , CB } ∪ { BA , BS } ∪ { BA , BC }
Since CB and BS do not exist, so we have -
V24= { AB } ∪ { BA } ∪ { BA , BC }
V24= { S , C } ∪ { A } ∪ { A , S }
V24= { S , C , A }
ROW 5:
V15
i=1,j=5,k=1,2,3,4
V15= V11.V24 ∪ V12.V33 ∪ V13.V42 ∪ V14.V51
V15= { B } { S , C , A } ∪ { A , S } { B } ∪ ϕ { A , S } ∪ ϕ { A , C }
V15= { BS , BC , BA } ∪ { AB , SB } ∪ ϕ ∪ ϕ
Since BS and SB do not exist, so we have -
V15= { BC , BA } ∪ { AB } ∪ ϕ ∪ ϕ
V15= { S , A } ∪ { S , C } ∪ ϕ ∪ ϕ
V15 = { S , A , C }
5 {S,A,C}
4 {ϕ} {S,A,C}
3 {ϕ} {B} {B}
2 {S,A} {B} {S,C} {S,A}
1 {B} {A,C} {A,C} {B} {A,C}
1 2 3 4 5

• There exist a total of 4 distinct substrings which are members of the language of the given
grammar.
• These 4 substrings are ba, ab, aaba, baaba.
• This is because they contain the start symbol S in their respective cells.
• Strings which cannot be derived from any variable are baa, baab.
• This is because they contain ϕ in their respective cell.
• Strings which can be derived from variable B alone are b, aa, aba, aab.
• This is because they contain variable B alone in their respective cell.
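The row-by-row construction above can be automated directly. Below is a minimal Python sketch (my own illustration, not part of the original notes) of the CYK membership test for a grammar in Chomsky Normal Form, using the same grammar and string as the worked example.

from itertools import product

# The example grammar, in Chomsky Normal Form: variable -> list of right-hand sides.
GRAMMAR = {
    "S": [("A", "B"), ("B", "C")],
    "A": [("B", "A"), ("a",)],
    "B": [("C", "C"), ("b",)],
    "C": [("A", "B"), ("a",)],
}
START = "S"

def cyk(w):
    # V[i][j] is the set of variables deriving the substring of w that starts at
    # position i (0-based) and has length j+1, mirroring the Vij cells above.
    n = len(w)
    V = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(w):                      # row 1: single terminals
        V[i][0] = {A for A, rhss in GRAMMAR.items() if (ch,) in rhss}
    for length in range(2, n + 1):                  # rows 2..n
        for i in range(n - length + 1):
            for k in range(1, length):              # split point k = 1..length-1
                left, right = V[i][k - 1], V[i + k][length - k - 1]
                for A, rhss in GRAMMAR.items():
                    if any((B, C) in rhss for B, C in product(left, right)):
                        V[i][length - 1].add(A)
    return V

table = cyk("baaba")
print(table[0][4])            # {'S', 'A', 'C'} in some order: the top cell V15
print(START in table[0][4])   # True, so "baaba" is accepted

The string is accepted exactly when the start symbol appears in the top cell, matching the conclusion drawn from the hand-built table.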
5.4 AUGMENTED GRAMMARS
• A lexicalized PCFG is one in which the probabilities for a rule depend on the relationship
between words in the parse tree.
Example: To get at the relationship between the verb “eat” and the nouns “banana” versus
“bandanna,”
• We can't have the probability depend on every word in the tree, because we won't have
enough training data to estimate all those probabilities.
• It is useful to introduce the notion of the head of a phrase: the most important word.
Thus, “eat” is the head of the VP “eat a banana” and “banana” is the head of the NP “a
banana.”
We write VP(v) to denote a phrase with category VP whose head word is v.
• We say that the category VP is augmented with the head variable v.
Here is an augmented grammar that describes the verb–object relation:
VP(v) → Verb(v) NP(n) [P1(v, n)]
VP(v) → Verb(v) [P2(v)]
NP(n) → Article(a) Adjs(j) Noun(n) [P3(n, a)]
Noun(banana) → banana [pn]
We would set the probability P1(v, n) to be relatively high when v is “eat” and n is “banana,” and
low when n is “bandanna.”

DEFINITE CLAUSE GRAMMAR:


• An augmented rule can be translated into a logical sentence
NP(n) → Article(a) Adjs(j) Noun(n) {Compatible(j, n)}.
• The new aspect here is the notation {constraint} to denote a logical constraint on
some of the variables; the rule only holds when the constraint is true.
Compatible(j, n) is meant to test whether adjective j and noun n are compatible; it would be
defined by a series of assertions such as Compatible(black, dog).
Converting a Grammar Rule to a Definite Clause:
1. Reversing the order of right- and left-hand sides
2. Making a conjunction of all the constituents and constraints
3. Adding a variable si to the list of arguments for each constituent to represent the
sequence of words spanned by the constituent
4. Adding a term for the concatenation of words, Append(s1,...), to the list of arguments
for the root of the tree
Article(a, s1) ∧ Adjs(j, s2) ∧ Noun(n, s3) ∧ Compatible(j, n) ⇒ NP(n, Append(s1, s2, s3)).
• This definite clause says that if the predicate Article is true of a head word a and a
string s1, and Adjs is similarly true of a head word j and a string s2, and Noun is true
of a head word n and a string s3, and if j and n are compatible, then the predicate NP
is true of the head word n and the result of appending strings s1, s2, and s3.
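To make the constraint notation concrete, here is a minimal Python sketch (an assumption of mine, not the textbook's code) of the augmented rule NP(n) → Article(a) Adjs(j) Noun(n) {Compatible(j, n)}: it accepts a word sequence only when it has the Article Adjs Noun shape and every adjective is asserted compatible with the noun. The lexicon entries and Compatible facts below are illustrative.

# Illustrative lexicon and Compatible assertions (all assumed for this sketch).
ARTICLES = {"the", "a", "every"}
ADJS = {"smelly", "dead", "black"}
NOUNS = {"wumpus", "dog"}
COMPATIBLE = {("smelly", "wumpus"), ("dead", "wumpus"), ("black", "dog")}

def np(words):
    # NP(n) -> Article(a) Adjs(j) Noun(n) {Compatible(j, n)}:
    # return the head noun n if `words` matches the rule, otherwise None.
    if len(words) < 3:
        return None
    article, *adjs, noun = words
    if article not in ARTICLES or noun not in NOUNS:
        return None
    if not all(a in ADJS for a in adjs):
        return None
    if not all((a, noun) in COMPATIBLE for a in adjs):   # the {Compatible(j, n)} constraint
        return None
    return noun

print(np("the smelly dead wumpus".split()))   # 'wumpus'
print(np("the black wumpus".split()))         # None: Compatible(black, wumpus) is not asserted

Generalizing the single head variable j to a list of adjectives is a simplification made here for readability.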
CASE AGREEMENT AND SUBJECT–VERB AGREEMENT
Linguists say that the pronoun “I” is in the subjective case, and “me” is in the objective case.
The simple grammar E0 overgenerates, producing non-sentences such as “Me smell a
stench”.
We can account for this by splitting NP into two categories, NPS and NPO, to stand for noun
phrases in the subjective and objective case.
We would also need to split the category Pronoun into the two categories PronounS (which
includes “I”) and PronounO (which includes “me”).
GRAMMAR FOR CASE AGREEMENT
E1:
S → NPS VP | ...
NPS → PronounS | Name | Noun | ...
NPO → PronounO | Name | Noun | ...
VP → VP NPO | ...
PP → Prep NPO
PronounS → I | you | he | she | it | ...
PronounO → me | you | him | her | it | ...

E1 still overgenerates: English requires subject–verb agreement for the person and number of
the subject and main verb of a sentence.
Example:
I smell → Grammatically Correct
I smells → Wrong

It smell → wrong
It smells → Correct
NP(c, pn, head) has three augmentations: c is a parameter for case, pn is a parameter for
person and number, and head is a parameter for the head word of the phrase.

2 :
S(head) → NP(Sbj, pn, h) VP(pn, head) | ...
NP(c, pn, head) → Pronoun(c, pn, head) | Noun(c, pn, head) | ...
VP(pn, head) → VP(pn, head) NP(Obj, p, h) | ...
PP(head) → Prep(head) NP(Obj, pn, h)
Pronoun(Sbj, 1S,I) → I
Pronoun(Sbj , 1P, we) → we
Pronoun(Obj, 1S, me) → me
Pronoun(Obj, 3P,them) → them

5.5 SEMANTIC INTERPRETATION:


A grammar for arithmetic expressions, augmented with semantics
Exp(x) → Exp(x1) Operator(op) Exp(x2) {x = Apply(op, x1, x2)}
Exp(x) → ( Exp(x) )
Exp(x) → Number(x)
Number(x) → Digit(x)
Number(x) → Number(x1) Digit(x2) {x = 10 × x1 + x2}
Digit(x) → x {0 ≤ x ≤ 9}
Operator(x) → x {x ∈ {+, −, ÷, ×}}
Parse Tree for 3 + ( 4 ÷ 2 )
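To connect the semantic augmentations to actual computation, here is a minimal sketch (my own, not the textbook's) of a recursive-descent parser for this arithmetic grammar that carries the semantic value x along with each constituent, so parsing "3 + ( 4 ÷ 2 )" directly yields its meaning. For simplicity it evaluates operators left to right without precedence, which is enough for the parenthesized example.

import operator

# Semantic action for Operator(op): Apply(op, x1, x2).
APPLY = {"+": operator.add, "−": operator.sub, "×": operator.mul, "÷": operator.truediv}

def parse_exp(tokens, pos=0):
    # Exp -> Exp Operator Exp | ( Exp ) | Number, returning (value, next position).
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in APPLY:
        op = tokens[pos]
        rhs, pos = parse_term(tokens, pos + 1)
        value = APPLY[op](value, rhs)      # {x = Apply(op, x1, x2)}
    return value, pos

def parse_term(tokens, pos):
    if tokens[pos] == "(":                 # Exp -> ( Exp )
        value, pos = parse_exp(tokens, pos + 1)
        return value, pos + 1              # skip the closing ")"
    value = 0
    for ch in tokens[pos]:                 # Number -> Digit | Number Digit  {x = 10 * x1 + x2}
        assert "0" <= ch <= "9"            # Digit(x) -> x  {0 <= x <= 9}
        value = 10 * value + int(ch)
    return value, pos + 1

value, _ = parse_exp("3 + ( 4 ÷ 2 )".split())
print(value)   # 5.0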

5.6 APPLICATIONS WITH NLP


EMAIL FILTERS
Email filters are one of the most basic and initial applications of NLP online. It started out with
spam filters, uncovering certain words or phrases that signal a spam message. But filtering
has upgraded, just like early adaptations of NLP. One of the more prevalent, newer applications
of NLP is found in Gmail's email classification. The system recognizes if emails belong in one
of three categories (primary, social, or promotions) based on their contents.
SMART ASSISTANTS
Smart assistants like Apple’s Siri and Amazon’s Alexa recognize patterns in speech thanks to
voice recognition, then infer meaning and provide a useful response.
SEARCH RESULTS
Search engines use NLP to surface relevant results based on similar search behaviors or user
intent so the average person finds what they need without being a search-term wizard. For
example, Google not only predicts what popular searches may apply to your query as you start
typing, but it looks at the whole picture and recognizes what you’re trying to say rather than
the exact search words.
PREDICTIVE TEXT
Things like autocorrect, autocomplete, and predictive text are so commonplace on our
smartphones that we take them for granted. Autocomplete and predictive text are similar to
search engines in that they predict things to say based on what you type, finishing the word
or suggesting a relevant one. And autocorrect will sometimes even change words so that the
overall message makes more sense.
LANGUAGE TRANSLATION
With NLP, online translators can translate languages more accurately and present
grammatically correct results. This is infinitely helpful when trying to communicate with
someone in another language. Not only that, but when translating from another language to
your own, tools now recognize the language based on inputted text and translate it.
DIGITAL PHONE CALLS
We all hear “this call may be recorded for training purposes,” but rarely do we wonder what
that entails. It turns out these recordings may be used for training purposes if a customer is
aggrieved, but most of the time they go into the database for an NLP system to learn from
and improve in the future. Automated systems direct customer calls to a service representative
or online chatbots, which respond to customer requests with helpful information. This is an NLP
practice that many companies, including large telecommunications providers, have put to use.
DATA ANALYSIS
Natural language capabilities are being integrated into data analysis workflows as more BI
vendors offer a natural language interface to data visualizations.
TEXT ANALYTICS
Text analytics converts unstructured text data into meaningful data for analysis using different
linguistic, statistical, and machine learning techniques. While sentiment analysis sounds
daunting to brands, especially if they have a large customer base, a tool using NLP will
typically scour customer interactions, such as social media comments or reviews, or even
brand name mentions, to see what’s being said. Analysis of these interactions can help brands
determine how well a marketing campaign is doing or monitor trending customer issues before
they decide how to respond or enhance service for a better customer experience.

5.7 TYPES OF CHATBOT


1. Menu/button-based chatbots
2. Linguistic Based (Rule-Based Chatbots)
3. Keyword recognition-based chatbots
4. Machine Learning chatbots
5. The hybrid model
6. Voice bots
MENU/BUTTON-BASED CHATBOTS
Menu/button-based chatbots are the most basic type of chatbot currently implemented in the
market today. In most cases, these chatbots are glorified decision tree hierarchies presented
to the user in the form of buttons. Similar to the automated phone menus we all interact with
on almost a daily basis, these chatbots require the user to make several selections to dig
deeper towards the ultimate answer.
LINGUISTIC BASED (RULE-BASED CHATBOTS)
If you can predict the types of questions your customers may ask, a linguistic chatbot might
be the solution for you. Linguistic or rules-based chatbots create conversational automation
flows using if/then logic. First, you have to define the language conditions of your chatbots.
Conditions can be created to assess the words, the order of the words, synonyms, and more.
If the incoming query matches the conditions defined by your chatbot, your customers can
receive the appropriate help in no time.
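As a concrete illustration of this if/then approach (and of the "Developing a Simple Chatbot" application named in the unit outline), here is a minimal rule-based chatbot sketch; the rules, keywords, and canned replies below are all illustrative assumptions, not from the original notes.

import re

# Each rule pairs a condition (a regular expression) with a canned reply.
# Rules are checked in order; the first condition that matches wins.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.I), "Hello! How can I help you today?"),
    (re.compile(r"\border\b.*\bstatus\b", re.I), "Please share your order number and I will check its status."),
    (re.compile(r"\b(refund|return)\b", re.I), "I can help with returns. Could you tell me what you bought?"),
    (re.compile(r"\b(bye|goodbye)\b", re.I), "Goodbye! Have a nice day."),
]
FALLBACK = "Sorry, I did not understand that. Could you rephrase?"

def respond(message):
    # Return the reply of the first rule whose condition matches the message.
    for pattern, reply in RULES:
        if pattern.search(message):
            return reply
    return FALLBACK

print(respond("Hi there"))                   # greeting rule fires
print(respond("What is my order status?"))   # order-status rule fires
print(respond("Tell me a joke"))             # no rule matches, so the fallback is used

Real rule-based platforms let you combine such conditions (exact words, word order, synonyms) far more flexibly, but the control flow is essentially this if/then cascade.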
KEYWORD RECOGNITION-BASED CHATBOTS
Unlike menu-based chatbots, keyword recognition-based chatbots can listen to what users type
and respond appropriately. These chatbots utilize customizable keywords and an AI application,
Natural Language Processing (NLP), to determine how to serve an appropriate response to
the user. These types of chatbots fall short when they have to answer a lot of similar questions.
The NLP chatbots will start to slip when there are keyword redundancies between several
related questions.
MACHINE LEARNING CHATBOTS
These types of chatbots utilize Machine Learning (ML) and Artificial Intelligence (AI) to
remember conversations with specific users to learn and grow over time. Unlike keyword
recognition-based bots, chatbots that have contextual awareness are smart enough to
self-improve based on what users are asking for and how they are asking it.
VOICE BOTS
To make conversational interfaces even more vernacular, businesses are now beginning to use
voice-based chatbots or voice bots. Voice bots have been on the rise for the last couple of
years, with virtual assistants like Apple’s Siri and Amazon’s Alexa.
THE HYBRID MODEL
Businesses love the sophistication of AI chatbots, but don’t always have the talent or the
large volumes of data to support them. So, they opt for the hybrid model.
