Syntax Parsing: Lecture # 6
6001414-3
Natural Language Processing
Outline
■ What is Syntax?
■ Why should you care?
■ Two Views of Syntax
• Previous lectures:
• ordering of strings of words
• how to compute probabilistic
• part of speech categories
■ Today’s lecture: context-free grammars – formal models of
grammar and syntax
What is Syntax?
• Words do not occur in isolation – they combine into sentences
• Syntax refers to the way words are arranged together (the
study of sentence structure)
– rules for constructing grammatical sentences and
determining their meaning
– “Who does what to whom?”
• A grammar is a set of rules that govern the composition of
sentences
• Parsing refers to the process of analyzing an utterance in
terms of its syntactic structure
Parsing (Syntactic Structure)
INPUT: Boeing is located in Seattle
OUTPUT: [parse tree figure]
Why should you care?
■ Syntactic information is important for many tasks:
– Question answering
■ What books did he like?
– Grammar checking
■ He is friend of mine.
– Information extraction
■ Oracle acquired Sun.
■ Parsing provides a framework for semantic analysis
Two Views of Syntax
■ Dependency:
– Syntactic structure resides in relations between words
– Focus on the functional roles (subject, object, . . . )
■ Constituency:
– Syntactic structure consists in the composition of phrases
– Focus on structural categories (noun phrase, verb phrase,...)
Constituency
■ Syntactic structure represented
by phrase structure trees
– Words represented by
terminal tree nodes (leaves)
– Phrases represented by
internal tree nodes
– Phrase types specified by
node labels
■ Phrase structure trees can be
defined by context-free
grammars
Constituency cont’
■ A basic observation about syntactic structure is that groups
of words can act as single units
– Los Angeles
– a high-class spot such as Mindy’s
– three parties from Brooklyn
– they.
■ Such groups of words are called constituents
■ Constituents tend to have similar internal structure, and
behave similarly with respect to other units
Examples of constituents
■ noun phrases (NP)
she, the house, Robin Hood and his merry men, a
high-class spot such as Mindy’s
■ verb phrases (VP)
blushed, loves Mary, was told to sit down and be
quiet, lived happily ever after
■ prepositional phrases (PP)
on it, with the telescope, through the foggy dew,
apart from everything I have said so far
Immediate Constituency Analysis
We can find constituents by recursive decomposition:
The girl in the corner wears a yellow hat and dark sunglasses.
The girl in the corner + wears a yellow hat and dark sunglasses.
[The girl + in the corner] [wears + a yellow hat and dark sunglasses].
[The + girl] [in + the corner] wears [a yellow hat + and + dark
sunglasses].
The girl in [the + corner] wears [a + yellow hat] and [dark + sunglasses].
The girl in the corner wears a [yellow + hat] and dark sunglasses.
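The decomposition above can be written down as a nested structure, with each bracketed group becoming one nested list. The following is a minimal Python sketch of that idea; the function names `words` and `constituents` are made up for this illustration, and the sentence-final period is dropped.

```python
# Nested lists mirror the recursive splits shown above:
# every list is a constituent, every string is a word.
TREE = [
    [["The", "girl"], ["in", ["the", "corner"]]],
    ["wears", [["a", ["yellow", "hat"]], "and", ["dark", "sunglasses"]]],
]


def words(node):
    """Return the words covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node for w in words(child)]


def constituents(node):
    """Yield every multi-word constituent, outermost first."""
    if isinstance(node, str):
        return
    yield " ".join(words(node))
    for child in node:
        yield from constituents(child)


found = list(constituents(TREE))
for phrase in found:
    print(phrase)
```

Each printed line is one of the bracketed units from the analysis, starting with the full sentence and ending with the smallest two-word groups.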
Constituent Types
■ Noun phrase (NP)
– Head is a noun
– Typically functions as subject or object
– Examples:
■ determiner + noun: the dog
■ proper name: Barack Obama, Japan
■ pronoun: he, they
– Agreement, e.g. number, gender, definiteness
■ Head determines agreement
Constituent Types cont’
■ Prepositional phrase (PP)
– Head is a preposition
– Followed by an NP (prepositional argument)
– Examples:
■ prep + NP: in the garden, over the rooftops
■ Verb phrase (VP)
– Head is a verb (finite/non-finite)
– All elements of the sentence except the subject
– Examples:
■ verb: sleeps, danced
■ verb + NP: ate the cake
■ verb + NP + NP: gave him the cake
■ verb + NP + PP: put all the papers in the drawer
Phrase structure grammars
■ Capture constituent status and ordering
■ Formal model: context-free grammar
1. S → NP VP
2. NP → D N
3. VP → V NP
■ Syntactic structure as phrase structure trees
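The three rules on this slide can be stored as data and expanded mechanically. The sketch below is one possible way to do that in Python; the small lexicon for D, N and V is an assumption added for the example and does not come from the slides.

```python
from itertools import product

# Phrase structure rules from the slide: S -> NP VP, NP -> D N, VP -> V NP.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "NP"]],
}

# Hypothetical lexicon, assumed only for this illustration.
LEXICON = {
    "D": ["the"],
    "N": ["dog", "cat"],
    "V": ["chased"],
}


def expand(symbol):
    """Return every word sequence (a list of words) derivable from symbol."""
    if symbol in LEXICON:
        return [[word] for word in LEXICON[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Combine every expansion of each right-hand-side symbol, in order.
        for parts in product(*(expand(child) for child in rhs)):
            results.append([word for part in parts for word in part])
    return results


sentences = [" ".join(ws) for ws in expand("S")]
for s in sentences:
    print(s)
```

Exhaustive expansion only terminates here because these rules are not recursive; a grammar with recursive rules would need a depth limit or random sampling instead.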
Context-free grammars (CFGs)
■ Mathematical system for modeling constituency in English
– CFGs are also called Phrase-Structure Grammars
■ A CFG consists of
– rules or productions, expressing the way in which
symbols of the language can be grouped together, and
– a lexicon of words and symbols.
■ A CFG can be thought of in two ways
• A device for generating sentences
• A device for assigning structure to a sentence
■ True or false
1. The substring dog is a noun phrase
2. The substring little dog is a noun phrase
3. The substring the little dog is a noun phrase
4. The substring scared the little dog is a noun phrase
5. The substring scared the little dog is a verb phrase
Syntactic parsing
■ Given a string of terminals and a CFG, determine if
the string can be generated by the CFG.
– Also return a parse tree for the string
– Also return all possible parse trees for the string
◮ [DT] [woman reports]
◮ [DT woman] [reports]
◮ [DT NN] [reports]
◮ [NP] [reports]
◮ [ NP reports] [ ]
◮ etc.
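The recognition task, and the fact that one string may have several parse trees, can be illustrated with a chart that counts trees per span. The sketch below uses a CKY-style table, a technique these slides do not name, and assumes a grammar already in binary form. The grammar, the lexicon and the example sentence are assumptions chosen to show an attachment ambiguity.

```python
from collections import defaultdict

# Assumed toy grammar with binary rules only: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]

# Assumed lexicon: word -> categories it can have.
LEXICON = {
    "she": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Count the distinct parse trees rooted in start that cover all words."""
    n = len(words)
    # chart[i][j][A] = number of trees with root A over words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[i][i + 1][tag] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY:
                    chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]
    return chart[0][n][start]


ambiguous = count_parses("she saw the man with the telescope".split())
print(ambiguous)
```

A count of zero means the string is not generated by the grammar; a count above one means the string is structurally ambiguous under it.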
Parsing Example
Input: book that flight
[S [VP [Verb book] [NP [Det that] [Nominal [Noun flight]]]]]
Top Down Parsing
Input: book that flight. The slide sequence expands S from the top and backtracks (X) whenever a predicted category fails to match the next word.
■ Try S → NP VP
– NP → Pronoun: X at "book"
– NP → ProperNoun: X at "book"
– NP → Det Nominal: X at "book"
■ Try S → VP
– VP → Verb: Verb matches "book", but X because "that" is left unconsumed
– VP → Verb NP: Verb matches "book"; now expand NP
■ NP → Pronoun: X at "that"
■ NP → ProperNoun: X at "that"
■ NP → Det Nominal: Det matches "that", Nominal → Noun matches "flight"
■ Result: S → VP, VP → Verb NP, NP → Det Nominal, Nominal → Noun cover the whole input
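The top-down search traced in these slides can be sketched as a small backtracking parser. The rules below are the ones that appear in the trace; the lexicon entries are assumptions for the example. Left-recursive rules such as Nominal → Nominal Noun are deliberately left out, because a naive top-down parser would loop forever on them.

```python
# Rules seen in the top-down trace; alternatives are tried in this order.
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Pronoun"], ["ProperNoun"], ["Det", "Nominal"]],
    "VP": [["Verb"], ["Verb", "NP"]],
    "Nominal": [["Noun"]],
}

# Assumed lexicon for the example.
LEXICON = {
    "Pronoun": {"I", "she"},
    "ProperNoun": {"Houston"},
    "Det": {"that", "the"},
    "Noun": {"book", "flight"},
    "Verb": {"book"},
}


def parse(symbol, words, pos):
    """Yield (tree, next_pos) for every way symbol matches words from pos."""
    if symbol in LEXICON:
        if pos < len(words) and words[pos] in LEXICON[symbol]:
            yield (symbol, words[pos]), pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(symbol, rhs, words, pos, [])


def expand(symbol, rhs, words, pos, children):
    """Match the remaining right-hand-side symbols one after another."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, next_pos in parse(rhs[0], words, pos):
        yield from expand(symbol, rhs[1:], words, next_pos, children + [child])


def top_down_parses(sentence):
    """Return all trees for S that consume the entire sentence."""
    words = sentence.split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


trees = top_down_parses("book that flight")
print(trees)
```

Failed alternatives simply yield nothing, which plays the role of the X marks in the slides: the generator falls through to the next rule for the same symbol.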
Bottom Up Parsing
Input: book that flight. The slide sequence builds structure upward from the words and abandons (X) any partial structure that cannot be extended.
■ First reading: "book" as Noun
– Nominal → Noun
– Nominal → Nominal Noun: X
– Nominal → Nominal PP: NP → Det Nominal is built over "that flight", but no PP can be formed
– S → NP VP: X
– Nominal → Nominal PP: X
■ Second reading: "book" as Verb
– NP → Det Nominal over "that flight"
– VP → Verb over "book" alone: X
– VP → VP PP: X
– VP → Verb NP spans the whole input
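The bottom-up search can be sketched as a shift-reduce recognizer with backtracking. This is one possible realisation and not necessarily the procedure behind the slides. The rules are those visible in the frames, plus PP → Prep NP, which is an assumption since the slides never expand PP. The word-to-category table is also assumed.

```python
# (left-hand side, right-hand side) pairs; PP -> Prep NP is an assumed rule.
RULES = [
    ("S", ("NP", "VP")),
    ("S", ("VP",)),
    ("NP", ("Det", "Nominal")),
    ("Nominal", ("Noun",)),
    ("Nominal", ("Nominal", "Noun")),
    ("Nominal", ("Nominal", "PP")),
    ("VP", ("Verb",)),
    ("VP", ("Verb", "NP")),
    ("VP", ("VP", "PP")),
    ("PP", ("Prep", "NP")),
]

# Assumed lexical categories; "book" is ambiguous between Noun and Verb.
TAGS = {"book": ["Noun", "Verb"], "that": ["Det"], "flight": ["Noun"]}


def shift_reduce(stack, words):
    """Return True if some sequence of shifts and reduces yields a lone S."""
    if not words and stack == ("S",):
        return True
    # Reduce: replace a stack suffix that equals a right-hand side.
    for lhs, rhs in RULES:
        if stack[-len(rhs):] == rhs:
            if shift_reduce(stack[:-len(rhs)] + (lhs,), words):
                return True
    # Shift: move the next word onto the stack under each of its categories.
    if words:
        for tag in TAGS.get(words[0], []):
            if shift_reduce(stack + (tag,), words[1:]):
                return True
    return False


def recognize(sentence):
    """Bottom-up recognition of a whitespace-tokenised sentence."""
    return shift_reduce((), tuple(sentence.split()))


print(recognize("book that flight"))
```

Every branch that returns False corresponds to a dead end like those marked X in the slides. The search is exhaustive and can take exponential time, which is acceptable only for toy inputs like this one.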
DERIVATION                             Rule Used
S                                      S → NP VP
NP VP                                  NP → he
he VP                                  VP → VP PP
he VP PP                               VP → VB PP
he VB PP PP                            VB → drove
he drove PP PP                         PP → down the street
he drove down the street PP            PP → in the car
he drove down the street in the car
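The derivation can be replayed mechanically: each step rewrites the leftmost occurrence of the rule's left-hand side in the current string of symbols. The sketch below is a minimal Python illustration; the function name `derive` is made up for the example, and the rules are taken from the derivation on this slide.

```python
# Rules in the order in which the derivation applies them.
STEPS = [
    ("S", ["NP", "VP"]),
    ("NP", ["he"]),
    ("VP", ["VP", "PP"]),
    ("VP", ["VB", "PP"]),
    ("VB", ["drove"]),
    ("PP", ["down", "the", "street"]),
    ("PP", ["in", "the", "car"]),
]


def derive(start, steps):
    """Return the list of sentential forms produced by applying steps in order."""
    form = [start]
    history = [" ".join(form)]
    for lhs, rhs in steps:
        i = form.index(lhs)  # leftmost occurrence; ValueError if lhs is absent
        form[i:i + 1] = rhs  # rewrite that one symbol with the right-hand side
        history.append(" ".join(form))
    return history


history = derive("S", STEPS)
for line in history:
    print(line)
```

The printed lines reproduce the DERIVATION column, one sentential form per rule application.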
Treebanks
■ Constituency treebanks
– Treebanks with constituency-based annotation
– Example: Penn Treebank of English
■ Treebank grammars
– We can extract CFGs from constituency treebanks
– Treebank grammars can be used to build syntactic
parsers
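Extracting a treebank grammar amounts to reading one rule off every internal node of every tree and counting how often each rule occurs. The sketch below assumes trees in a bracketed, Penn-Treebank-like notation. The two example trees are invented for illustration and are not taken from any treebank; `read_tree` and `productions` are hypothetical helper names.

```python
from collections import Counter

# Invented example trees in bracketed notation (not real treebank entries).
TREES = [
    "(S (NP (PRP he)) (VP (VBD drove) (PP (IN down) (NP (DT the) (NN street)))))",
    "(S (NP (DT the) (NN dog)) (VP (VBD barked)))",
]


def read_tree(text):
    """Parse a bracketed tree into nested lists of the form [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume ")"
        return [label] + children

    return node()


def productions(tree, counts):
    """Add one count for the rule found at each internal node of tree."""
    label, children = tree[0], tree[1:]
    rhs = tuple(c[0] if isinstance(c, list) else c for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if isinstance(child, list):
            productions(child, counts)
    return counts


counts = Counter()
for text in TREES:
    productions(read_tree(text), counts)
for (lhs, rhs), n in counts.most_common():
    print(lhs, "->", " ".join(rhs), n)
```

Dividing each count by the total for its left-hand side would turn these counts into rule probabilities, which is a common next step but is not shown here.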