IR Slides Lec08b
Information Retrieval
Question Answering
Gintarė Grigonytė
gintare@ling.su.se
Department of Linguistics and Philology
Uppsala University
Slides based on previous IR course given by Jörg Tiedemann 2013-15
Question Answering
Question Answering (QA) is answering a question posed in natural
language and has to deal with a wide range of question types including: fact,
list, definition, how, why, hypothetical, semantically constrained, and
cross-lingual questions.
Motivation:
→ Very ambitious!
• What is RSI?
• Website with “Common misconceptions about RSI” ...
RSI is the same as a mouse arm.
Information Extraction
Motivation:
Sub-tasks:
Challenges
Evaluation in QA
Next steps
Mean reciprocal rank (MRR): the mean of the reciprocal rank of the first
passage retrieved that contains a correct answer.
\[
\mathrm{MRR}_{IR} = \frac{1}{N} \sum_{1}^{N} \frac{1}{\mathrm{rank}(\text{first\_relevant\_passage})}
\]
• coverage: 1
• redundancy: 2
• MRR_IR: 1/2 = 0.5
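The three figures above can be reproduced with a minimal sketch. The slides do not define coverage and redundancy, so the code assumes the usual passage-retrieval definitions: coverage is the fraction of questions with at least one retrieved passage containing a correct answer, and redundancy is the average number of such passages per question. The function name `evaluate_retrieval` and the boolean input format are illustrative choices, not part of the original material.

```python
def evaluate_retrieval(relevance_per_question):
    """Compute coverage, redundancy and MRR for ranked passage lists.

    relevance_per_question: one list per question, in rank order, where
    True marks a retrieved passage that contains a correct answer.
    (Assumed input format, chosen for illustration.)
    """
    n = len(relevance_per_question)
    if n == 0:
        raise ValueError("need at least one question")

    covered = 0          # questions with at least one relevant passage
    relevant_total = 0   # relevant passages summed over all questions
    rr_sum = 0.0         # sum of reciprocal ranks of the first relevant passage

    for flags in relevance_per_question:
        hit_ranks = [rank for rank, ok in enumerate(flags, start=1) if ok]
        if hit_ranks:
            covered += 1
            rr_sum += 1.0 / hit_ranks[0]
        relevant_total += len(hit_ranks)

    return {
        "coverage": covered / n,
        "redundancy": relevant_total / n,
        "mrr": rr_sum / n,
    }


# One question, four retrieved passages, correct answer at ranks 2 and 4.
scores = evaluate_retrieval([[False, True, False, True]])
print(scores)
```

A question with no relevant passage contributes 0 to all three sums, which is one common convention for MRR; other conventions exist.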
[Figure: IR coverage, IR redundancy and QA MRR plotted against the number of paragraphs retrieved (0 to 100). Left axis: coverage/MRR (in %); right axis: redundancy.]
• two-step procedure:
1. standard document retrieval
2. segmentation + passage selection
→ search-time passaging
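The two steps can be sketched in a few lines. This is only an illustration under simplifying assumptions: documents are scored by plain query-term overlap rather than a real ranking function such as Okapi, passages are taken to be blank-line-separated paragraphs, and the names `search_time_passaging`, `top_docs` and `top_passages` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query_tokens, text):
    """Count occurrences of distinct query terms in the text (toy scoring)."""
    counts = Counter(tokenize(text))
    return sum(counts[t] for t in set(query_tokens))


def search_time_passaging(query, documents, top_docs=2, top_passages=3):
    """Two-step retrieval: rank documents, then segment and rank passages."""
    q = tokenize(query)

    # Step 1: standard document retrieval (here: toy term-overlap ranking).
    ranked_docs = sorted(
        documents, key=lambda d: overlap_score(q, d), reverse=True
    )[:top_docs]

    # Step 2: segment the retrieved documents into passages and select
    # the best-scoring ones. Passages are assumed to be paragraphs.
    passages = [
        p.strip() for d in ranked_docs for p in d.split("\n\n") if p.strip()
    ]
    scored = [(overlap_score(q, p), p) for p in passages]
    scored = [sp for sp in scored if sp[0] > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return scored[:top_passages]


docs = [
    "RSI stands for repetitive strain injury.\n\nIt affects muscles and tendons.",
    "The weather is nice today.\n\nNothing about the topic.",
    "Common misconceptions about RSI.\n\nRSI is the same as a mouse arm.",
]
for score, passage in search_time_passaging("What is RSI?", docs):
    print(score, passage)
```

Index-time passaging would instead segment the collection before indexing and run a single retrieval step directly over passages.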
[Figure, two panels: % coverage against rank (0 to 200) and answer redundancy against rank (0 to 200), each for Okapi and Approaches 1 to 4. Dotted line = index-time passaging (Roberts & Gaizauskas, 2004).]
→ not much to gain with two-step procedure
• retrieve less
→ lower risk of wrong answer extraction
→ increased efficiency
Now we have
• expected answer type (→ question type)
• relevant passages
• retrieval scores (ranking)
• (syntactically) analysed question
Summary of Terminology