Lecture 2 Hierarchy of NLP & TF-IDF


Amity School of Engineering and Technology

MODULE III
Understanding Natural Languages
TF-IDF
Q1. Apply the Bag of Words (BoW) method to the following sentences and convert them to vector form:
Sentence 1: This movie is very scary and long
Sentence 2: This movie is not scary and is slow
Sentence 3: This movie is spooky and good
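As a cross-check, the same BoW vectors can be produced with scikit-learn's CountVectorizer; a minimal sketch follows (scikit-learn lowercases the text and orders the vocabulary alphabetically, so the columns may be arranged differently from a hand-built table):

```python
# Bag-of-Words sketch for Q1 using scikit-learn's CountVectorizer.
from sklearn.feature_extraction.text import CountVectorizer

sentences = [
    "This movie is very scary and long",
    "This movie is not scary and is slow",
    "This movie is spooky and good",
]

vectorizer = CountVectorizer()              # builds the vocabulary and counts terms
bow = vectorizer.fit_transform(sentences)   # sparse document-term matrix

print(vectorizer.get_feature_names_out())   # vocabulary, in alphabetical order
print(bow.toarray())                        # one count vector per sentence
```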

Q2. Apply the TF-IDF method to the same sentences and draw the final table of feature values:
Sentence 1: This movie is very scary and long
Sentence 2: This movie is not scary and is slow
Sentence 3: This movie is spooky and good
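Similarly, the TF-IDF table for Q2 can be cross-checked with scikit-learn's TfidfVectorizer. A sketch follows; note that scikit-learn applies a smoothed IDF and L2 normalisation by default, so its values will differ from a textbook tf * log(N/df) table unless those options are changed:

```python
# TF-IDF sketch for Q2 using scikit-learn and pandas; default smoothing and
# L2 normalisation mean the numbers differ slightly from a hand-computed table.
from sklearn.feature_extraction.text import TfidfVectorizer
import pandas as pd

sentences = [
    "This movie is very scary and long",
    "This movie is not scary and is slow",
    "This movie is spooky and good",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(sentences)

# Lay the result out as the final table of feature values.
table = pd.DataFrame(tfidf.toarray(),
                     columns=vectorizer.get_feature_names_out(),
                     index=["Sentence 1", "Sentence 2", "Sentence 3"])
print(table.round(3))
```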

12
Topic Modeling
• Uncovering hidden structures in sets of
texts or documents.
• Groups texts to discover latent topics.
• Assumes each document consists of a
mixture of topics and that each topic
consists of a set of words.
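As an illustration (not from the slides), a toy run of Latent Dirichlet Allocation in scikit-learn shows both sides of that assumption: each learned topic is a distribution over words, and each document gets a mixture over the topics. The documents below are made up for the example:

```python
# Toy topic-modeling sketch with LDA (illustrative documents, 2 topics).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the movie had a great plot and fine acting",
    "the actor won an award for the film",
    "the team scored in the final minute of the match",
    "the striker and the goalkeeper trained all season",
]

counts = CountVectorizer(stop_words="english")
X = counts.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

words = counts.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top_words = [words[i] for i in topic.argsort()[-4:]]
    print(f"Topic {k}:", top_words)   # each topic = a set of characteristic words
print(lda.transform(X).round(2))      # each document = a mixture of the topics
```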

13
Topic Modeling
(Example)

14
Parsing
• Breaking down a given sentence into its
grammatical constituents.
• Example:
• “Who won the cricket worldcup in 2019?”
• “The swift black cat jumps over the wall”

15
Part-of-speech (POS) tagging

• According to the role a word plays in a sentence, it can be tagged as a noun, verb, adjective, adverb, preposition, etc.
• The correct tag should be assigned to each word.
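A quick sketch with NLTK's off-the-shelf tagger (this assumes the 'punkt' and 'averaged_perceptron_tagger' resources have been downloaded once):

```python
# POS-tagging sketch with NLTK; run nltk.download("punkt") and
# nltk.download("averaged_perceptron_tagger") once beforehand.
import nltk

tokens = nltk.word_tokenize("The swift black cat jumps over the wall")
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('swift', 'JJ'), ('black', 'JJ'), ('cat', 'NN'), ...]
```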

16
Constituency parsing
• Need to identify and define commonly
seen grammatical patterns.
• Divide words into groups, called
constituents, based on their grammatical
role in the sentence.
• Example:
• ‘Amitian — read — an article on Syntactic
Analysis’
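One plausible constituent bracketing of that example, rendered with NLTK's Tree class (the grouping below is an illustrative assumption, not taken from the slide):

```python
# Constituency sketch: one plausible bracketing of the example sentence,
# drawn with nltk.Tree; the grouping itself is an illustrative assumption.
from nltk import Tree

tree = Tree.fromstring(
    "(S (NP Amitian)"
    "   (VP (V read)"
    "       (NP (Det an) (N article))"
    "       (PP (P on) (NP Syntactic Analysis))))"
)
tree.pretty_print()   # prints the constituent structure as an ASCII tree
```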

17
Dependency Parsing
• Dependencies are established between
words themselves.
• Example:
• ‘Amitians attend classes’
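A sketch with spaCy, assuming the small English model en_core_web_sm has been installed (python -m spacy download en_core_web_sm):

```python
# Dependency-parsing sketch with spaCy; assumes en_core_web_sm is installed.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Amitians attend classes")

for token in doc:
    # each word is linked to its head by a labelled dependency relation
    print(f"{token.text:10} --{token.dep_}--> {token.head.text}")
```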

18
Co-reference resolution
• Coreference resolution is the task of
finding all expressions that refer to the
same entity in a text.
• Example: a text with two entities, ‘Michael Cohen’ and ‘Mr. Trump’; all later mentions of each must be linked back to the correct one.

19
Word sense
disambiguation
• NLP involves resolving different kinds of
ambiguity.
• A word can take on different meanings, which makes it ambiguous to understand.
• Word sense disambiguation (WSD) means
selecting the correct word sense for a
particular word.

20
Word sense
disambiguation
• Example:
• The word “bank”. It can refer to a financial
institution or the land alongside a river.
• These different meanings are called word
senses.
• Context can be used effectively to perform
WSD.
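NLTK includes a simplified Lesk algorithm that picks a WordNet sense by overlapping the context words with each sense's definition; a sketch (requires the 'wordnet' and 'punkt' resources):

```python
# WSD sketch with NLTK's simplified Lesk algorithm; run
# nltk.download("wordnet") and nltk.download("punkt") once beforehand.
from nltk import word_tokenize
from nltk.wsd import lesk

for sentence in ["I deposited cash at the bank",
                 "We sat on the bank of the river"]:
    sense = lesk(word_tokenize(sentence), "bank")
    print(sentence, "->", sense)   # the WordNet synset chosen from the context
```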

21
Named entity
recognition
• Identification of named entities such as
persons, locations, organisations which
are denoted by proper nouns.
• Example:
• “Michael Jordan is a professor at
Berkeley.”
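With spaCy (same en_core_web_sm assumption as in the dependency example), the recognised entities and their labels can be read directly off the processed text:

```python
# NER sketch with spaCy; assumes en_core_web_sm is installed.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Michael Jordan is a professor at Berkeley.")

for ent in doc.ents:
    print(ent.text, "->", ent.label_)   # e.g. PERSON, ORG or GPE
```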

22
Context free grammars
• It is a grammar consisting of rewrite rules with a single symbol on the left-hand side. Let us create a grammar to parse the sentence
• “The bird pecks the grains”

23
Context free grammars

24
Context free grammars
• The parse tree breaks down the sentence
into structured parts so that the computer
can easily understand and process it.
• In order for the parsing algorithm to
construct this parse tree, a set of rewrite
rules, which describe what tree structures
are legal, need to be constructed.

25
Context free grammars
• These rules say that a certain symbol may be expanded in the tree into a sequence of other symbols.
• For example, the rule S → NP VP says that if there are two strings, a noun phrase (NP) and a verb phrase (VP), then the string formed by NP followed by VP is a sentence.

26
Context free grammars
• The rewrite rules for the sentence are as
follows −

27
Context free grammars
• The parse tree can be created as shown −
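A sketch of the grammar and the parse in NLTK, reconstructing the usual rewrite rules for this sentence (S → NP VP, NP → DET N, VP → V NP, plus word-level rules; the exact rule set is an assumption):

```python
# CFG sketch with NLTK for "The bird pecks the grains". Note that V covers
# both "peck" and "pecks", which is what lets the agreement error discussed
# below slip through.
import nltk

grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> DET N
    VP  -> V NP
    DET -> 'the'
    N   -> 'bird' | 'grains'
    V   -> 'peck' | 'pecks'
""")

parser = nltk.ChartParser(grammar)
tokens = "the bird pecks the grains".split()

for tree in parser.parse(tokens):
    tree.pretty_print()   # draws the parse tree described on this slide
```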

28
Context free grammars
• Now consider the above rewrite rules. Since V can be replaced by either "peck" or "pecks", sentences such as "The bird peck the grains" are wrongly permitted, i.e. a subject-verb agreement error is accepted as correct.

29
Context free grammars
• Merit − The simplest style of grammar, and therefore the most widely used.
• Demerits −
They are not highly precise. For example, “The grains peck the bird” is syntactically correct according to the parser, and even though it makes no sense, the parser accepts it as a correct sentence.

30
Context free grammars
• Demerits
 To achieve high precision, multiple sets of grammar rules need to be prepared.
 Completely different sets of rules may be required for parsing singular and plural variations, passive sentences, etc., which can lead to a huge set of rules that is unmanageable.

31
Transformational
Grammar
• These are grammars in which a sentence can be represented structurally in two stages.
• Obtaining different structures from sentences having the same meaning is undesirable in language understanding systems.
• Sentences with the same meaning should always correspond to the same internal knowledge structures.
32
Transformational
Grammar
• In one stage the basic structure of the sentence is analyzed to determine the grammatical constituent parts, and the second stage works in just the reverse way of the first.
• This reveals the surface structure of the sentence, the way the sentence is used in speech or in writing.

33
Transformational Grammar

• Alternatively, we can say that applying the transformation rules can change a sentence from passive voice to active voice, and vice versa.

34
Transformational Grammar

35
• The two sentences above are different sentences, but they have the same meaning.
• Thus this is an example of a transformational grammar.
• These grammars were never widely used in computational models of natural language.
• The applications of this grammar include changing voice (active to passive and passive to active), changing a question to declarative form, etc.
36
TRANSITION NETWORK

• It is a method to represent natural languages, based on directed graphs and finite state automata.
• A transition network can be constructed with the help of some inputs, states and outputs.
• A transition network may consist of states or nodes and labeled arcs from one state to the next, through which it moves.
37
• An arc represents the rule or condition upon which the transition is made from one state to another.
• For example, a transition network used to recognize a sentence consisting of an article, a noun, an auxiliary, a verb, an article and a noun would be represented as follows.

38
39
• The transition from N1 to N2 will be made if
an article is the first input symbol.
• If successful, state N2 is entered.
• The transition from N2 to N3 can be made if
a noun is found next.
• If successful, state N3 is entered.
• The transition from N3 to N4 can be made if
an auxiliary is found and so on.
40
• Consider the sentence “A boy is eating a banana”.
• If the sentence is parsed with the above transition network, first ‘A’ is an article, so a successful transition is made from node N1 to N2.
• Then ‘boy’ is a noun (N2 to N3), ‘is’ is an auxiliary (N3 to N4), ‘eating’ is a verb (N4 to N5), ‘a’ is an article (N5 to N6), and finally ‘banana’ is a noun (N6 to N7).
• So the above sentence is successfully recognized by the transition network.
41
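A minimal Python sketch of that article-noun-auxiliary-verb-article-noun network, using the state names N1..N7 from the walkthrough above (the lexicon is illustrative):

```python
# Transition-network recognizer sketch; states and lexicon are illustrative.
LEXICON = {"a": "ARTICLE", "the": "ARTICLE",
           "boy": "NOUN", "banana": "NOUN",
           "is": "AUX", "eating": "VERB"}

# arcs: (current state, required word category) -> next state
ARCS = {("N1", "ARTICLE"): "N2", ("N2", "NOUN"): "N3",
        ("N3", "AUX"): "N4",     ("N4", "VERB"): "N5",
        ("N5", "ARTICLE"): "N6", ("N6", "NOUN"): "N7"}

def recognise(sentence, start="N1", final="N7"):
    state = start
    for word in sentence.lower().split():
        state = ARCS.get((state, LEXICON.get(word)))  # follow the labelled arc
        if state is None:
            return False                              # no matching arc: reject
    return state == final                             # accept only in the final state

print(recognise("A boy is eating a banana"))  # True
print(recognise("A boy banana"))              # False
```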
TYPES OF TRANSITION
NETWORK
• There are generally two types of transition networks:
1. Recursive Transition Networks (RTN)
2. Augmented Transition Networks (ATN)

42
Recursive Transition Networks (RTN)

• RTNs can be considered a development of finite state automata, with additional machinery that allows some definitions to be recursive.
• A recursive transition network consists of nodes (states) and labeled arcs (transitions).

43
• Rather than permitting only word categories, it permits arc labels that refer to other networks, and those networks may in turn refer back to the referring network.
• It is thus a modified version of the basic transition network.

44
Augmented Transition Network
(ATN)
• An ATN is a modified transition network.
• It is an extension of RTN.
• The ATN uses a top-down parsing procedure to gather various types of information that are later used by the understanding system.
• It produces a data structure suitable for further processing and capable of storing semantic details.
45
• An augmented transition network (ATN) is
a recursive transition network that can
perform tests and take actions during arc
transitions.
• An ATN uses a set of registers to store
information.
• A set of actions is defined for each arc and
the actions can look at and modify the
registers.
• An arc may have a test associated with it.
46
• The arc is traversed (and its action taken) only if the test succeeds.
• When a lexical arc is traversed, the current word is placed in a special variable (*) that keeps track of it.
• The ATN was first used in the LUNAR system.
• In ATN, the arc can have a further arbitrary
test and an arbitrary action.
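A toy Python sketch of that idea (the arc format, register names and agreement test are illustrative assumptions, not a library API): each arc carries a test that must succeed before it is traversed and an action that may update the registers, which is what lets an ATN reject the subject-verb agreement error that the plain CFG accepted:

```python
# ATN-style sketch: arcs carry a test and an action over a register store.
LEXICON = {"the": "DET", "bird": "NOUN", "birds": "NOUN",
           "peck": "VERB", "pecks": "VERB"}

def is_plural(noun):                 # toy number feature used by the test
    return noun.endswith("s")

ARCS = {
    # (state, category): (next state, test(registers, word), action(registers, word))
    ("S0", "DET"):  ("S1", lambda r, w: True,
                           lambda r, w: None),
    ("S1", "NOUN"): ("S2", lambda r, w: True,
                           lambda r, w: r.update(SUBJ_PLURAL=is_plural(w))),
    ("S2", "VERB"): ("S3", lambda r, w: (w == "peck") == r["SUBJ_PLURAL"],
                           lambda r, w: None),   # subject-verb agreement test
}

def recognise(sentence):
    state, registers = "S0", {}
    for word in sentence.lower().split():
        arc = ARCS.get((state, LEXICON.get(word)))
        if arc is None:
            return False
        state, test, action = arc
        if not test(registers, word):   # the arc is taken only if its test succeeds
            return False
        action(registers, word)         # the action may inspect/modify the registers
    return state == "S3"

print(recognise("The bird pecks"))    # True: singular subject, singular verb
print(recognise("The bird peck"))     # False: the agreement test rejects it
print(recognise("The birds peck"))    # True: plural subject, plural verb
```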

47
The structure of ATN

48
