Lecture 2 Hierarchy of NLP & TF-IDF


Amity School of Engineering and Technology

MODULE III
Understanding Natural Languages
TF-IDF
Q1. Apply the Bag of Words (BoW) method to the following sentences and convert them to vector form:
Sentence 1: This movie is very scary and long
Sentence 2: This movie is not scary and is slow
Sentence 3: This movie is spooky and good
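As a cross-check, the same BoW vectors can be produced with scikit-learn's CountVectorizer; a minimal sketch follows (scikit-learn lowercases the text and orders the vocabulary alphabetically, so the columns may be arranged differently from a hand-built table):

```python
# Bag-of-Words sketch for Q1 using scikit-learn's CountVectorizer.
from sklearn.feature_extraction.text import CountVectorizer

sentences = [
    "This movie is very scary and long",
    "This movie is not scary and is slow",
    "This movie is spooky and good",
]

vectorizer = CountVectorizer()              # builds the vocabulary and counts terms
bow = vectorizer.fit_transform(sentences)   # sparse document-term matrix

print(vectorizer.get_feature_names_out())   # vocabulary, in alphabetical order
print(bow.toarray())                        # one count vector per sentence
```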

Q2. Apply the TF-IDF method to the same sentences and draw the final table of feature values:
Sentence 1: This movie is very scary and long
Sentence 2: This movie is not scary and is slow
Sentence 3: This movie is spooky and good
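Similarly, the TF-IDF table for Q2 can be cross-checked with scikit-learn's TfidfVectorizer. A sketch follows; note that scikit-learn applies a smoothed IDF and L2 normalisation by default, so its values will differ from a textbook tf * log(N/df) table unless those options are changed:

```python
# TF-IDF sketch for Q2 using scikit-learn and pandas; default smoothing and
# L2 normalisation mean the numbers differ slightly from a hand-computed table.
from sklearn.feature_extraction.text import TfidfVectorizer
import pandas as pd

sentences = [
    "This movie is very scary and long",
    "This movie is not scary and is slow",
    "This movie is spooky and good",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(sentences)

# Lay the result out as the final table of feature values.
table = pd.DataFrame(tfidf.toarray(),
                     columns=vectorizer.get_feature_names_out(),
                     index=["Sentence 1", "Sentence 2", "Sentence 3"])
print(table.round(3))
```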

12
Topic Modeling
• Uncovering hidden structures in sets of
texts or documents.
• Groups texts to discover latent topics.
• Assumes each document consists of a
mixture of topics and that each topic
consists of a set of words.
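As an illustration (not from the slides), a toy run of Latent Dirichlet Allocation in scikit-learn shows both sides of that assumption: each learned topic is a distribution over words, and each document gets a mixture over the topics. The documents below are made up for the example:

```python
# Toy topic-modeling sketch with LDA (illustrative documents, 2 topics).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the movie had a great plot and fine acting",
    "the actor won an award for the film",
    "the team scored in the final minute of the match",
    "the striker and the goalkeeper trained all season",
]

counts = CountVectorizer(stop_words="english")
X = counts.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

words = counts.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top_words = [words[i] for i in topic.argsort()[-4:]]
    print(f"Topic {k}:", top_words)   # each topic = a set of characteristic words
print(lda.transform(X).round(2))      # each document = a mixture of the topics
```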

13
Topic Modeling
(Example)

14
Parsing
• Breaking down a given sentence into its
grammatical constituents.
• Example:
• “Who won the cricket worldcup in 2019?”
• “The swift black cat jumps over the wall”

15
Part-of-speech (POS) tagging

• According to the role a word plays in a sentence, it can be tagged as a noun, verb, adjective, adverb, preposition, etc.
• The correct tag should be assigned to each word.
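A quick sketch with NLTK's off-the-shelf tagger (this assumes the 'punkt' and 'averaged_perceptron_tagger' resources have been downloaded once):

```python
# POS-tagging sketch with NLTK; run nltk.download("punkt") and
# nltk.download("averaged_perceptron_tagger") once beforehand.
import nltk

tokens = nltk.word_tokenize("The swift black cat jumps over the wall")
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('swift', 'JJ'), ('black', 'JJ'), ('cat', 'NN'), ...]
```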

16
Constituency parsing
• Need to identify and define commonly
seen grammatical patterns.
• Divide words into groups, called
constituents, based on their grammatical
role in the sentence.
• Example:
• ‘Amitian — read — an article on Syntactic
Analysis’
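One plausible constituent bracketing of that example, rendered with NLTK's Tree class (the grouping below is an illustrative assumption, not taken from the slide):

```python
# Constituency sketch: one plausible bracketing of the example sentence,
# drawn with nltk.Tree; the grouping itself is an illustrative assumption.
from nltk import Tree

tree = Tree.fromstring(
    "(S (NP Amitian)"
    "   (VP (V read)"
    "       (NP (Det an) (N article))"
    "       (PP (P on) (NP Syntactic Analysis))))"
)
tree.pretty_print()   # prints the constituent structure as an ASCII tree
```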

17
Dependency Parsing
• Dependencies are established between
words themselves.
• Example:
• ‘Amitians attend classes’
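A sketch with spaCy, assuming the small English model en_core_web_sm has been installed (python -m spacy download en_core_web_sm):

```python
# Dependency-parsing sketch with spaCy; assumes en_core_web_sm is installed.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Amitians attend classes")

for token in doc:
    # each word is linked to its head by a labelled dependency relation
    print(f"{token.text:10} --{token.dep_}--> {token.head.text}")
```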

18
Co-reference resolution
• Coreference resolution is the task of
finding all expressions that refer to the
same entity in a text.
• Example: a text with two entities, ‘Michael Cohen’ and ‘Mr. Trump’; all later mentions of each must be linked back to the correct one.

19
Word sense
disambiguation
• NLP involves resolving different kinds of
ambiguity.
• A word can take on different meanings, which makes it ambiguous to understand.
• Word sense disambiguation (WSD) means
selecting the correct word sense for a
particular word.

20
Word sense
disambiguation
• Example:
• The word “bank”. It can refer to a financial
institution or the land alongside a river.
• These different meanings are called word
senses.
• Context can be used effectively to perform
WSD.
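NLTK includes a simplified Lesk algorithm that picks a WordNet sense by overlapping the context words with each sense's definition; a sketch (requires the 'wordnet' and 'punkt' resources):

```python
# WSD sketch with NLTK's simplified Lesk algorithm; run
# nltk.download("wordnet") and nltk.download("punkt") once beforehand.
from nltk import word_tokenize
from nltk.wsd import lesk

for sentence in ["I deposited cash at the bank",
                 "We sat on the bank of the river"]:
    sense = lesk(word_tokenize(sentence), "bank")
    print(sentence, "->", sense)   # the WordNet synset chosen from the context
```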

21
Named entity
recognition
• Identification of named entities such as
persons, locations, organisations which
are denoted by proper nouns.
• Example:
• “Michael Jordan is a professor at
Berkeley.”
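With spaCy (same en_core_web_sm assumption as in the dependency example), the recognised entities and their labels can be read directly off the processed text:

```python
# NER sketch with spaCy; assumes en_core_web_sm is installed.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Michael Jordan is a professor at Berkeley.")

for ent in doc.ents:
    print(ent.text, "->", ent.label_)   # e.g. PERSON, ORG or GPE
```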

22
Context free grammars
• It is a grammar consisting of rewrite rules with a single symbol on the left-hand side. Let us create a grammar to parse the sentence
• “The bird pecks the grains”

23
Context free grammars

24
Context free grammars
• The parse tree breaks down the sentence
into structured parts so that the computer
can easily understand and process it.
• In order for the parsing algorithm to
construct this parse tree, a set of rewrite
rules, which describe what tree structures
are legal, need to be constructed.

25
Context free grammars
• These rules say that a certain symbol may be expanded in the tree into a sequence of other symbols.
• For example, the rule S → NP VP says that if there are two strings, a noun phrase (NP) and a verb phrase (VP), then the string formed by NP followed by VP is a sentence.

26
Context free grammars
• The rewrite rules for the sentence are as
follows −

27
Context free grammars
• The parse tree can be created as shown −
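A sketch of the grammar and the parse in NLTK, reconstructing the usual rewrite rules for this sentence (S → NP VP, NP → DET N, VP → V NP, plus word-level rules; the exact rule set is an assumption):

```python
# CFG sketch with NLTK for "The bird pecks the grains". Note that V covers
# both "peck" and "pecks", which is what lets the agreement error discussed
# below slip through.
import nltk

grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> DET N
    VP  -> V NP
    DET -> 'the'
    N   -> 'bird' | 'grains'
    V   -> 'peck' | 'pecks'
""")

parser = nltk.ChartParser(grammar)
tokens = "the bird pecks the grains".split()

for tree in parser.parse(tokens):
    tree.pretty_print()   # draws the parse tree described on this slide
```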

28
Context free grammars
• Now consider the above rewrite rules. Since V can be replaced by either "peck" or "pecks", sentences such as "The bird peck the grains" are wrongly permitted, i.e. a subject-verb agreement error is accepted as correct.

29
Context free grammars
• Merit − The simplest style of grammar, and therefore the most widely used.
• Demerits −
They are not highly precise. For example, “The grains peck the bird” is syntactically correct according to the parser, and even though it makes no sense, the parser accepts it as a correct sentence.

30
Context free grammars
• Demerits
 To achieve high precision, multiple sets of grammar rules need to be prepared.
 Completely different sets of rules may be required for parsing singular and plural variations, passive sentences, etc., which can lead to a huge set of rules that is unmanageable.

31
Transformational
Grammar
• These are grammars in which a sentence can be represented structurally in two stages.
• Obtaining different structures from sentences having the same meaning is undesirable in language understanding systems.
• Sentences with the same meaning should always correspond to the same internal knowledge structures.
32
Transformational
Grammar
• In one stage the basic structure of the sentence is analyzed to determine the grammatical constituent parts, and the second stage works in just the reverse way of the first.
• This reveals the surface structure of the sentence, the way the sentence is used in speech or in writing.

33
Transformational Grammar

• Alternatively, we can say that applying the transformation rules can change a sentence from passive voice to active voice, and vice versa.

34
Transformational Grammar

35
• The two sentences above are different sentences, but they have the same meaning.
• Thus this is an example of a transformational grammar.
• These grammars were never widely used in computational models of natural language.
• The applications of this grammar include changing voice (active to passive and passive to active), changing a question to declarative form, etc.
36
TRANSITION NETWORK

• It is a method to represent natural languages, based on directed graphs and finite state automata.
• A transition network can be constructed with the help of some inputs, states and outputs.
• A transition network may consist of states or nodes and labeled arcs from one state to the next, through which it moves.
37
• An arc represents the rule or condition upon which the transition is made from one state to another.
• For example, a transition network used to recognize a sentence consisting of an article, a noun, an auxiliary, a verb, an article and a noun would be represented as follows.

38
39
• The transition from N1 to N2 will be made if
an article is the first input symbol.
• If successful, state N2 is entered.
• The transition from N2 to N3 can be made if
a noun is found next.
• If successful, state N3 is entered.
• The transition from N3 to N4 can be made if
an auxiliary is found and so on.
40
• Consider the sentence “A boy is eating a banana”.
• If the sentence is parsed with the above transition network, first ‘A’ is an article, so a successful transition is made from node N1 to N2.
• Then ‘boy’ is a noun (N2 to N3), ‘is’ is an auxiliary (N3 to N4), ‘eating’ is a verb (N4 to N5), ‘a’ is an article (N5 to N6), and finally ‘banana’ is a noun (N6 to N7).
• So the above sentence is successfully recognized by the transition network.
41
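A minimal Python sketch of that article-noun-auxiliary-verb-article-noun network, using the state names N1..N7 from the walkthrough above (the lexicon is illustrative):

```python
# Transition-network recognizer sketch; states and lexicon are illustrative.
LEXICON = {"a": "ARTICLE", "the": "ARTICLE",
           "boy": "NOUN", "banana": "NOUN",
           "is": "AUX", "eating": "VERB"}

# arcs: (current state, required word category) -> next state
ARCS = {("N1", "ARTICLE"): "N2", ("N2", "NOUN"): "N3",
        ("N3", "AUX"): "N4",     ("N4", "VERB"): "N5",
        ("N5", "ARTICLE"): "N6", ("N6", "NOUN"): "N7"}

def recognise(sentence, start="N1", final="N7"):
    state = start
    for word in sentence.lower().split():
        state = ARCS.get((state, LEXICON.get(word)))  # follow the labelled arc
        if state is None:
            return False                              # no matching arc: reject
    return state == final                             # accept only in the final state

print(recognise("A boy is eating a banana"))  # True
print(recognise("A boy banana"))              # False
```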
TYPES OF TRANSITION
NETWORK
• There are generally two types of transition networks:
1. Recursive Transition Networks (RTN)
2. Augmented Transition Networks (ATN)

42
Recursive Transition Networks (RTN)

• RTNs can be considered a development of finite state automata, with additional machinery that allows some definitions to be recursive.
• A recursive transition network consists of nodes (states) and labeled arcs (transitions).

43
• Rather than permitting only word categories, it permits arc labels that refer to other networks, and those networks may in turn refer back to the referring network.
• It is thus a modified version of the basic transition network.

44
Augmented Transition Network
(ATN)
• An ATN is a modified transition network.
• It is an extension of RTN.
• The ATN uses a top-down parsing procedure to gather various types of information that are later used by the understanding system.
• It produces a data structure suitable for further processing and capable of storing semantic details.
45
• An augmented transition network (ATN) is
a recursive transition network that can
perform tests and take actions during arc
transitions.
• An ATN uses a set of registers to store
information.
• A set of actions is defined for each arc and
the actions can look at and modify the
registers.
• An arc may have a test associated with it.
46
• The arc is traversed (and its action taken) only if the test succeeds.
• When a lexical arc is traversed, the current word is placed in a special variable (*) that keeps track of it.
• The ATN was first used in the LUNAR system.
• In ATN, the arc can have a further arbitrary
test and an arbitrary action.
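A toy Python sketch of that idea (the arc format, register names and agreement test are illustrative assumptions, not a library API): each arc carries a test that must succeed before it is traversed and an action that may update the registers, which is what lets an ATN reject the subject-verb agreement error that the plain CFG accepted:

```python
# ATN-style sketch: arcs carry a test and an action over a register store.
LEXICON = {"the": "DET", "bird": "NOUN", "birds": "NOUN",
           "peck": "VERB", "pecks": "VERB"}

def is_plural(noun):                 # toy number feature used by the test
    return noun.endswith("s")

ARCS = {
    # (state, category): (next state, test(registers, word), action(registers, word))
    ("S0", "DET"):  ("S1", lambda r, w: True,
                           lambda r, w: None),
    ("S1", "NOUN"): ("S2", lambda r, w: True,
                           lambda r, w: r.update(SUBJ_PLURAL=is_plural(w))),
    ("S2", "VERB"): ("S3", lambda r, w: (w == "peck") == r["SUBJ_PLURAL"],
                           lambda r, w: None),   # subject-verb agreement test
}

def recognise(sentence):
    state, registers = "S0", {}
    for word in sentence.lower().split():
        arc = ARCS.get((state, LEXICON.get(word)))
        if arc is None:
            return False
        state, test, action = arc
        if not test(registers, word):   # the arc is taken only if its test succeeds
            return False
        action(registers, word)         # the action may inspect/modify the registers
    return state == "S3"

print(recognise("The bird pecks"))    # True: singular subject, singular verb
print(recognise("The bird peck"))     # False: the agreement test rejects it
print(recognise("The birds peck"))    # True: plural subject, plural verb
```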

47
The structure of ATN

48
