
SYNTAX PARSING

lecture # 6

6001414-3
Natural Language Processing
Outline
■ What is Syntax?
■ Why should you care?
■ Two Views of Syntax
■ Previous lectures:
– ordering of strings of words
– how to compute probabilistic
– part-of-speech categories
■ Today’s lecture: context-free grammars, formal models of grammar and syntax
What is Syntax?
• Words do not occur in isolation – they combine into sentences
• Syntax refers to the way words are arranged together (the study of sentence structure)
– rules for constructing grammatical sentences and determining their meaning
– “Who does what to whom?”
• A grammar is a set of rules that govern the composition of
sentences
• Parsing refers to the process of analyzing an utterance in
terms of its syntactic structure
Parsing (Syntactic Structure)
INPUT: Boeing is located in Seattle

OUTPUT: [figure: parse tree of the input sentence]
Why should you care?
■ Syntactic information is important for many tasks:
– Question answering
■ What books did he like?
– Grammar checking
■ He is friend of mine.
– Information extraction
■ Oracle acquired Sun.
■ Parsing provides a framework for semantic analysis
Two Views of Syntax
■ Dependency:
– Syntactic structure resides in relations between words
– Focus on the functional roles (subject, object, . . . )

■ Constituency:
– Syntactic structure consists in the composition of phrases
– Focus on structural categories (noun phrase, verb phrase,...)
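The two views can be contrasted on the sentence "Oracle acquired Sun" from the previous slide. The sketch below is only an illustration: the tuple encodings, the category labels (S, NP, VP, V) and the helper names are assumptions made for this example, not notation from the lecture.

```python
# Dependency view: labelled relations between words (head, relation, dependent).
dependencies = [
    ("acquired", "subject", "Oracle"),
    ("acquired", "object", "Sun"),
]

# Constituency view: phrases nested inside phrases (label, child, child, ...).
constituency = ("S", ("NP", "Oracle"), ("VP", ("V", "acquired"), ("NP", "Sun")))


def who_does_what_to_whom(deps):
    """Read subject, verb and object directly off the dependency relations."""
    roles = {relation: dependent for _head, relation, dependent in deps}
    verb = deps[0][0]
    return roles["subject"], verb, roles["object"]


def words(node):
    """Collect the words at the leaves of a constituency tree, left to right."""
    _label, *children = node
    result = []
    for child in children:
        result.extend([child] if isinstance(child, str) else words(child))
    return result


print(who_does_what_to_whom(dependencies))
print(words(constituency))
```

The dependency triples answer "who does what to whom" directly, while the constituency tree makes the phrase boundaries explicit.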
Constituency
■ Syntactic structure represented by phrase structure trees
– Words represented by terminal tree nodes (leaves)
– Phrases represented by internal tree nodes
– Phrase types specified by node labels
■ Phrase structure trees can be defined by context-free grammars
Constituency cont’
■ A basic observation about syntactic structure is that groups of words can act as single units
– Los Angeles
– a high-class spot such as Mindy’s
– three parties from Brooklyn
– they
■ Such groups of words are called constituents
■ Constituents tend to have similar internal structure, and behave similarly with respect to other units
Examples of constituents
■ noun phrases (NP)
she, the house, Robin Hood and his merry men, a
high-class spot such as Mindy’s
■ verb phrases (VP)
blushed, loves Mary, was told to sit down and be
quiet, lived happily ever after
■ prepositional phrases (PP)
on it, with the telescope, through the foggy dew,
apart from everything I have said so far
Immediate Constituency Analysis
We can find constituents by recursive decomposition:
The girl in the corner wears a yellow hat and dark sunglasses.
The girl in the corner + wears a yellow hat and dark sunglasses.
[The girl + in the corner] [wears + a yellow hat and dark sunglasses].
[The + girl] [in + the corner] wears [a yellow hat + and + dark sunglasses].
The girl in [the + corner] wears [a + yellow hat] and [dark + sunglasses].
The girl in the corner wears a [yellow + hat] and dark sunglasses.
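The recursive decomposition above can be mirrored in a small data structure. The following is a minimal sketch for the subject phrase "The girl in the corner" only; the nested-tuple encoding and the labels D, N and P are assumptions made for illustration.

```python
# A phrase structure tree as nested tuples: (label, child, child, ...).
# A node whose only child is a string is a single word.
tree = ("NP",
        ("NP", ("D", "The"), ("N", "girl")),
        ("PP", ("P", "in"), ("NP", ("D", "the"), ("N", "corner"))))


def leaves(node):
    """Return the words covered by a node, left to right."""
    _label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    return [word for child in children for word in leaves(child)]


def constituents(node):
    """Yield (label, phrase) for every multi-word constituent, top-down."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return  # a single word, not a phrase
    yield label, " ".join(leaves(node))
    for child in children:
        yield from constituents(child)


for label, phrase in constituents(tree):
    print(label, "->", phrase)
```

Each printed line corresponds to one bracketed group in the decomposition: the whole phrase, then "The girl", "in the corner" and "the corner".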
Constituent Types
■ Noun phrase (NP)
– Head is a noun
– Typically functions as subject or object
– Examples:
■ determiner + noun: the dog
■ proper name: Barack Obama, Japan
■ pronoun: he, they
– Agreement, e.g. number, gender, definiteness
■ Head determines agreement
Constituent Types cont’
■ Prepositional phrase (PP)
– Head is a preposition
– Followed by an NP (prepositional argument)
– Examples:
■ prep + NP: in the garden, over the rooftops
■ Verb phrase (VP)
– Head is a verb (finite/non-finite)
– All elements of the sentence except the subject
– Examples:
■ verb: sleeps, danced
■ verb + NP: ate the cake
■ verb + NP + NP: gave him the cake
■ verb + NP + PP: put all the papers in the drawer
Phrase structure grammars
■ Capture constituent status and ordering
■ Formal model: context-free grammar
1. S → NP VP
2. NP → D N
3. VP → V NP
■ Syntactic structure as phrase structure trees
Context-free grammars (CFGs)
■ Mathematical system for modeling constituency in English
– CFGs are also called Phrase-Structure Grammars
■ A CFG consists of
– rules or productions, expressing the way in which symbols of the language can be grouped together, and
– a lexicon of words and symbols.
■ A CFG can be thought of in two ways
• A device for generating sentences
• A device for assigning structure to a sentence

Parse tree derived for the string “a flight”


Context-free grammars (CFGs) cont’
■ Formally, a CFG is a 4-tuple < N, Σ, R, S >, where
■ N is a set of non-terminal symbols (syntactic categories)
■ Σ is a set of terminal symbols (words)
■ R is a set of rules A→α, where
– A is a non-terminal
– α is a string of symbols taken from the set (Σ ∪ N)∗
■ S is a designated start symbol
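The 4-tuple definition can be written down almost literally. The sketch below is one possible encoding, not a standard API: the class name CFG, its field names, and the example words (the, dog, cat, chased) are assumptions for illustration. The three phrase rules are the ones from the Phrase structure grammars slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset  # N: syntactic categories
    terminals: frozenset     # Σ: words
    rules: tuple             # R: pairs (A, α) with A in N and α a tuple over Σ ∪ N
    start: str               # S: designated start symbol

    def __post_init__(self):
        # Check the side conditions of the formal definition.
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be a non-terminal")
        if self.nonterminals & self.terminals:
            raise ValueError("N and Σ must be disjoint")
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.rules:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not a non-terminal")
            if any(symbol not in symbols for symbol in rhs):
                raise ValueError(f"unknown symbol in rule {lhs} -> {rhs}")


grammar = CFG(
    nonterminals=frozenset({"S", "NP", "VP", "D", "N", "V"}),
    terminals=frozenset({"the", "dog", "cat", "chased"}),  # hypothetical words
    rules=(
        ("S", ("NP", "VP")),
        ("NP", ("D", "N")),
        ("VP", ("V", "NP")),
        ("D", ("the",)),
        ("N", ("dog",)),
        ("N", ("cat",)),
        ("V", ("chased",)),
    ),
    start="S",
)
print(len(grammar.rules), "rules, start symbol", grammar.start)
```

An empty right-hand side is allowed by the definition, since α ranges over (Σ ∪ N)∗, and the check above accepts it.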
Rules in Context Free Grammars
Grammar rule                Example
S → NP VP                   I + want a morning flight
NP → Pronoun                I
NP → Proper-Noun            Los Angeles
NP → Det Nominal            a flight
Nominal → Nominal Noun      morning flight
Nominal → Noun              flights
VP → Verb                   do
VP → Verb NP                want + a flight
VP → Verb NP PP             leave + Boston + in the morning
VP → Verb PP                leaving + on Thursday
PP → Preposition NP         from + Los Angeles
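Read as a device for generating sentences, these rules can be expanded at random from S. The following is a rough sketch under stated assumptions: the lexicon is a small hypothetical word list loosely taken from the example column, and the depth cap is an addition of this sketch to keep the left-recursive rule Nominal → Nominal Noun from expanding indefinitely.

```python
import random

# Phrase rules copied from the table above.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Pronoun"], ["Proper-Noun"], ["Det", "Nominal"]],
    "Nominal": [["Nominal", "Noun"], ["Noun"]],
    "VP": [["Verb"], ["Verb", "NP"], ["Verb", "NP", "PP"], ["Verb", "PP"]],
    "PP": [["Preposition", "NP"]],
}

# Hypothetical lexicon (an assumption; the slide gives only example phrases).
LEXICON = {
    "Pronoun": ["I"],
    "Proper-Noun": ["Los Angeles", "Boston"],
    "Det": ["a", "the"],
    "Noun": ["flight", "flights", "morning"],
    "Verb": ["want", "leave", "do"],
    "Preposition": ["from", "in", "on"],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Expand a symbol into a list of words by choosing rules at random."""
    if symbol in LEXICON:
        return [rng.choice(LEXICON[symbol])]
    options = RULES[symbol]
    if depth >= max_depth:
        # Past the cap, force the shortest rule so the expansion terminates.
        options = [min(options, key=len)]
    words = []
    for child in rng.choice(options):
        words.extend(generate(child, rng, depth + 1, max_depth))
    return words


rng = random.Random(0)
for _ in range(3):
    print(" ".join(generate("S", rng)))
```

The output is grammatical only with respect to these rules; the grammar says nothing about agreement or meaning, so odd sentences are expected.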
Quiz
The big bear scared the little dog.

■ True or false
1. The substring dog is a noun phrase
2. The substring little dog is a noun phrase
3. The substring the little dog is a noun phrase
4. The substring scared the little dog is a noun phrase
5. The substring scared the little dog is a verb phrase
Syntactic parsing
■ Given a string of terminals and a CFG, determine if the string can be generated by the CFG.
– Also return a parse tree for the string
– Also return all possible parse trees for the string
■ automatically determining the syntactic structure for a given sentence
– search through all possible trees for a sentence
– bottom-up vs top-down approaches
Top-down
■ builds structure from root of tree (S) to leaves
■ operates with a list of constituents to be built and rewrites them
by matching their category to the grammar rules
■ Common search strategy
◦ Top-down, left-to-right, backtracking
◦ Expand all constituents in these trees/rules
◦ Continue until the leaves are part-of-speech (POS) categories
◦ Backtrack when a candidate POS does not match the input string
■ Example:
◮ [S]
◮ [NP VP]
◮ [DT NN VP] [NP PP VP]
◮ etc.
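The top-down, left-to-right, backtracking strategy can be sketched as a small recursive-descent parser. This is an illustrative sketch, not the lecture's implementation: the function names are hypothetical, and the grammar is a cut-down version for the sentence "book that flight" used in the lecture's parsing example. The left-recursive rule Nominal → Nominal Noun is left out on purpose, because a naive top-down parser would expand it forever.

```python
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Pronoun"], ["ProperNoun"], ["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}
# Part-of-speech categories each word may have.
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}


def expand(symbol, words, i):
    """Yield (tree, next_position) for every way symbol can start at words[i]."""
    if symbol not in GRAMMAR:  # a POS category: must match the next word
        if i < len(words) and symbol in LEXICON.get(words[i], ()):
            yield (symbol, words[i]), i + 1
        return  # no match: the caller backtracks to its next alternative
    for rhs in GRAMMAR[symbol]:  # try the rules in order
        for children, j in expand_sequence(rhs, words, i):
            yield (symbol, *children), j


def expand_sequence(symbols, words, i):
    """Expand a right-hand side left to right, threading the input position."""
    if not symbols:
        yield [], i
        return
    for tree, j in expand(symbols[0], words, i):
        for rest, k in expand_sequence(symbols[1:], words, j):
            yield [tree] + rest, k


def parse(words):
    """Return all trees rooted in S that cover the whole input."""
    return [tree for tree, end in expand("S", words, 0) if end == len(words)]


print(parse("book that flight".split()))
```

Because the generators enumerate every alternative, `parse` returns all parse trees for the input, and an empty list when the string is not generated by the grammar.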
Bottom-up
■ starts with the words and tries to build the trees from them upward
■ Parse continues until an S root node is reached or no further node expansion is possible
■ Example (each step shows [processed] [remaining input]):
◮ [the] [woman reports]
◮ [DT] [woman reports]
◮ [DT woman] [reports]
◮ [DT NN] [reports]
◮ [NP] [reports]
◮ [NP reports] [ ]
◮ etc.
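One way to realise the bottom-up idea is a shift-reduce search with backtracking. The sketch below is an assumption-laden illustration, not the lecture's algorithm: the three rules and the lexicon for "the woman reports" are hypothetical, and words are tagged at the moment they are shifted.

```python
# Hypothetical toy grammar and lexicon for "the woman reports".
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("DT", "NN")),
    ("VP", ("VB",)),
]
LEXICON = {"the": ["DT"], "woman": ["NN"], "reports": ["NN", "VB"]}


def shift_reduce(words, start="S"):
    """Return the list of (stack, remaining) states of a successful parse, or None."""
    seen = set()

    def search(stack, rest):
        if (stack, rest) in seen:  # state already explored without success
            return None
        seen.add((stack, rest))
        if stack == (start,) and not rest:
            return [(stack, rest)]
        # Reduce: replace a rule's right-hand side on top of the stack by its left-hand side.
        for lhs, rhs in RULES:
            n = len(rhs)
            if n <= len(stack) and stack[-n:] == rhs:
                path = search(stack[:-n] + (lhs,), rest)
                if path is not None:
                    return [(stack, rest)] + path
        # Shift: move the next word onto the stack under one of its POS tags.
        if rest:
            for tag in LEXICON.get(rest[0], ()):
                path = search(stack + (tag,), rest[1:])
                if path is not None:
                    return [(stack, rest)] + path
        return None  # dead end: the caller backtracks

    return search((), tuple(words))


for stack, rest in shift_reduce("the woman reports".split()):
    print(list(stack), list(rest))
```

The printed states follow the same pattern as the example above: categories accumulate on the left while the remaining input shrinks, until only S is left.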
Parsing Example
Input: book that flight
Parse tree: [S [VP [Verb book] [NP [Det that] [Nominal [Noun flight]]]]]
Top Down Parsing
Input: book that flight. Each step expands the leftmost unexpanded node; X marks a mismatch with the input, after which the parser backtracks.
1. S → NP VP, NP → Pronoun: X (Pronoun does not match book)
2. NP → ProperNoun: X (ProperNoun does not match book)
3. NP → Det Nominal: X (Det does not match book)
4. S → VP, VP → Verb, Verb → book: X (that is left unconsumed)
5. VP → Verb NP, Verb → book
6. NP → Pronoun: X (Pronoun does not match that)
7. NP → ProperNoun: X (ProperNoun does not match that)
8. NP → Det Nominal, Det → that
9. Nominal → Noun, Noun → flight: all input is consumed and the parse is complete
Bottom Up Parsing
Input: book that flight. Structure is built upward from the words; X marks a dead end, after which the parser backtracks.
1. book → Noun, Nominal → Noun
2. Nominal → Nominal Noun: X (the next word, that, is not a Noun)
3. Nominal → Nominal PP: look for a PP starting at that
4. that → Det, flight → Noun, Nominal → Noun, NP → Det Nominal
5. S → NP VP over the NP that flight: X (no VP follows)
6. Nominal → Nominal PP: X (that flight is an NP, not a PP)
7. Backtrack on the category of book: book → Verb (the NP over that flight is kept)
8. VP → Verb over book alone: X
9. VP → VP PP: X (that flight is an NP, not a PP)
10. VP → Verb NP: the VP covers book that flight, completing the tree shown in the Parsing Example
DERIVATION                              Rule Used
S                                       S → NP VP
NP VP                                   NP → he
he VP                                   VP → VP PP
he VP PP                                VP → VB PP
he VB PP PP                             VB → drove
he drove PP PP                          PP → down the street
he drove down the street PP             PP → in the car
he drove down the street in the car
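The derivation above always rewrites the leftmost non-terminal, so it can be replayed mechanically. A minimal sketch, assuming the rules are applied in the order listed (the function and variable names are hypothetical):

```python
# Rules in the order they are used in the derivation.
RULES_USED = [
    ("S", ["NP", "VP"]),
    ("NP", ["he"]),
    ("VP", ["VP", "PP"]),
    ("VP", ["VB", "PP"]),
    ("VB", ["drove"]),
    ("PP", ["down", "the", "street"]),
    ("PP", ["in", "the", "car"]),
]
NONTERMINALS = {lhs for lhs, _ in RULES_USED}


def leftmost_derivation(start, rules):
    """Apply each rule to the leftmost non-terminal and record every sentential form."""
    form = [start]
    steps = [list(form)]
    for lhs, rhs in rules:
        # Position of the leftmost non-terminal in the current form.
        i = next(k for k, symbol in enumerate(form) if symbol in NONTERMINALS)
        if form[i] != lhs:
            raise ValueError(f"expected {lhs}, found {form[i]}")
        form[i:i + 1] = rhs  # rewrite it with the rule's right-hand side
        steps.append(list(form))
    return steps


for step in leftmost_derivation("S", RULES_USED):
    print(" ".join(step))
```

Each printed line is one row of the DERIVATION column, ending with the full sentence.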
Treebanks
■ Constituency treebanks
– Treebanks with constituency-based annotation
– Example: Penn Treebank of English
■ Treebank grammars
– We can extract CFGs from constituency treebanks
– Treebank grammars can be used to build syntactic
parsers
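Extracting a treebank grammar amounts to reading one rule off every internal node of every annotated tree and counting. The sketch below assumes a Penn-Treebank-style bracketed string; the example tree is the lecture's parse of "book that flight", not an actual treebank sentence, and the function names are hypothetical.

```python
from collections import Counter


def parse_bracketed(text):
    """Turn '(S (VP ...))' into nested (label, children) pairs; words stay strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # skip the closing bracket
        return (label, children)

    return read()


def extract_rules(tree, counts):
    """Count one rule 'label -> child labels or words' per internal node."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            extract_rules(child, counts)


treebank = ["(S (VP (Verb book) (NP (Det that) (Nominal (Noun flight)))))"]
counts = Counter()
for line in treebank:
    extract_rules(parse_bracketed(line), counts)

for (lhs, rhs), n in counts.items():
    print(f"{lhs} -> {' '.join(rhs)}   (count {n})")
```

With many trees, the counts can be normalised per left-hand side to estimate rule probabilities.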
