Professional Documents
Culture Documents
Passage Retrieval & Answer Processing Component: E. Jembere Based On Lecture Slides Form Kathy Mckeown Lecture Slides
Passage Retrieval & Answer Processing Component: E. Jembere Based On Lecture Slides Form Kathy Mckeown Lecture Slides
Passage Retrieval & Answer Processing Component: E. Jembere Based On Lecture Slides Form Kathy Mckeown Lecture Slides
&
Answer Processing
Component
E. Jembere
1
Passage Retrieval
Extracts and ranks passages
using surface-text techniques
WordNet WordNet
Document
Parser Retrieval Parser
NER NER
2
Passage Retrieval
• Passage retrieval starts with document retrieval
• Although the documents will be ranked, the top
ranked document might not be relevant to the
question.
• Passage retrieval proceed as follows
Question types are used to extract the passages with
the expected entities.
Passages that do not contain the expected entities are
filtered out.
The remaining passages are then ranked using some
rules.
3
Features used for Ranking
passages
• The following features are usually used for ranking
passages
The number of named entities of the right type in the passages
The number of question key words in the passage
The longest exact sequence of questions keywords that occurs
in the passage
The rank of the document from which the passage was
extracted
The proximity of the key words from the original query to each
other in the passage.
The N-gram overlap between the passage and the question
04/15/2023 4
Answer Extraction
Extracts and ranks passages
using surface-text techniques
WordNet WordNet
Document
Parser Retrieval Parser
NER NER
5
Answer Processing
• Two main algorithms are used for this purpose
1. Answer-type pattern extraction
Use the information about the expected answer type
together with regular expressions
2. N-gram tiling
Uses combination of N-Grams from a reformulated query to
construct an answer to the question
04/15/2023 6
Answer-type pattern extraction
7
Answer-type pattern extraction
8
Some examples
9
N-Gram Tiling:AskMSR
10
Step 1: Rewrite the questions
• Intuition: The user’s question is often
syntactically quite close to sentences that
contain the answer
11
Step 2: Query search engine
12
Step 3: Gathering N-Grams
• Enumerate all N-grams (N=1,2,3) in all retrieved
snippets
• Weight of an n-gram: occurrence count
• Example: “Who created the character of Scrooge?”
Dickens 117
Christmas Carol 78
Charles Dickens 75
Disney 72
Carl Banks 54
A Christmas 41
Christmas Carol 45
Uncle 31
13
Step 4: Filtering N-Grams
• Each question type is associated with one or more “data-
type filters”
• Boost score of n-grams that match the expected answer
type.
• Lower score of n-grams that don’t match.
• For example
The filter for
How many dogs pull a sled?
prefers a number
So disprefer candidate n-grams like
Dog race, run, Alaskan, dog racing
Prefer candidate n-grams like
Pool of 16 dogs
14
Step 5: Tiling the Answers
Scores
20 Charles Dickens
merged, discard
15 Dickens old n-grams
Mr Charles
10
15