Assignment 11

Natural Language Processing
Type of Questions: MCQ
Number of Questions: 8 Total Marks: 8
Question 1: What are the ideal qualities of a summary in automatic extractive text
summarization?
1. Maximum relevance of summary to the theme of the document, Maximum
redundancy in information held by the summary.
2. Minimum relevance of summary to the theme of the document, Minimum
redundancy in information held by the summary.
3. Minimum relevance of summary to the theme of the document, Maximum
redundancy in information held by the summary.
4. Maximum relevance of summary to the theme of the document, Minimum
redundancy in information held by the summary.
Answer: 4
Solution: Maximum relevance to the theme is the basic goal of any summarization
method. Redundancy of the information stored in a summary should be minimum to
make sure that the summary is brief.
Question 2: Imagine you are summarizing a document of 5 sentences using the LEXRANK
algorithm. You have found the following similarity matrix M between the
sentences. M[i, j] is the similarity between sentences ti and tj.
 
\[
M = \begin{pmatrix}
0.000 & 0.500 & 0.400 & 0.200 & 0.000 \\
0.500 & 0.000 & 0.020 & 0.000 & 0.100 \\
0.400 & 0.020 & 0.000 & 0.000 & 0.400 \\
0.200 & 0.000 & 0.000 & 0.000 & 0.300 \\
0.000 & 0.100 & 0.400 & 0.300 & 0.000
\end{pmatrix} \tag{1}
\]
What is the row stochastic version of this matrix?
1.
\[
\tilde{M} = \begin{pmatrix}
0.000 & 0.500 & 0.400 & 0.200 & 0.000 \\
0.500 & 0.000 & 0.020 & 0.000 & 0.100 \\
0.400 & 0.020 & 0.000 & 0.000 & 0.400 \\
0.200 & 0.000 & 0.000 & 0.000 & 0.300 \\
0.000 & 0.100 & 0.400 & 0.300 & 0.000
\end{pmatrix}
\]
2.
\[
\tilde{M} = \begin{pmatrix}
0.000 & 0.455 & 0.364 & 0.182 & 0.000 \\
0.806 & 0.000 & 0.032 & 0.000 & 0.161 \\
0.488 & 0.024 & 0.000 & 0.000 & 0.488 \\
0.400 & 0.000 & 0.000 & 0.000 & 0.600 \\
0.000 & 0.125 & 0.500 & 0.375 & 0.000
\end{pmatrix}
\]
3.
\[
\tilde{M} = \begin{pmatrix}
0.000 & 0.806 & 0.488 & 0.400 & 0.000 \\
0.455 & 0.000 & 0.024 & 0.000 & 0.125 \\
0.364 & 0.032 & 0.000 & 0.000 & 0.500 \\
0.182 & 0.000 & 0.000 & 0.000 & 0.375 \\
0.000 & 0.161 & 0.488 & 0.600 & 0.000
\end{pmatrix}
\]
4. None of the above
Answer: 2
Solution: To make M row-stochastic, normalize each row by the sum of its elements:
\[
\tilde{M}_{i,j} = \frac{M_{i,j}}{\sum_{k} M_{i,k}}
\]
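The row normalization above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `row_stochastic` is hypothetical, and leaving an all-zero row unchanged is an assumption made here to avoid division by zero (no such row occurs in M).

```python
# Similarity matrix M from Equation 1.
M = [
    [0.000, 0.500, 0.400, 0.200, 0.000],
    [0.500, 0.000, 0.020, 0.000, 0.100],
    [0.400, 0.020, 0.000, 0.000, 0.400],
    [0.200, 0.000, 0.000, 0.000, 0.300],
    [0.000, 0.100, 0.400, 0.300, 0.000],
]


def row_stochastic(matrix):
    """Divide every entry by the sum of its row (hypothetical helper)."""
    result = []
    for row in matrix:
        total = sum(row)
        # Assumption: an all-zero row is kept as zeros instead of dividing by 0.
        result.append([x / total if total else 0.0 for x in row])
    return result


M_tilde = row_stochastic(M)
for row in M_tilde:
    # Print each normalized row with three decimals, as in the answer options.
    print(" ".join(f"{x:.3f}" for x in row))
```

Each printed row sums to 1, and the values agree with option 2 to three decimals.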
Question 3: Find the PageRank values for the 5 sentence nodes from Question 2
(with similarity values as per matrix M in Equation 1). Use µ = 1.0.
[HINT: Use an online matrix multiplier or MATLAB/Python for solving this problem.]

1. 0.286 0.161 0.214 0.130 0.208
2. 0.251 0.173 0.201 0.163 0.212
3. 0.061 0.072 0.226 0.324 0.317
4. 0.310 0.031 0.207 0.348 0.104
Answer: 1
Solution: The initial value of v can be any probability vector, for example the
uniform one. Iterating the update below until v stops changing gives option 1.
v = [0.2, 0.2, 0.2, 0.2, 0.2]
µ = 1.0
# Repeat until convergence
v = µ(v M̃) + (1 − µ)/5
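The update rule above can be run directly as a power iteration. The following is a minimal sketch in plain Python, not a reference implementation: the function name `pagerank`, the stopping tolerance `tol`, and the iteration cap `max_iter` are assumptions introduced for illustration.

```python
# Similarity matrix M from Equation 1.
M = [
    [0.000, 0.500, 0.400, 0.200, 0.000],
    [0.500, 0.000, 0.020, 0.000, 0.100],
    [0.400, 0.020, 0.000, 0.000, 0.400],
    [0.200, 0.000, 0.000, 0.000, 0.300],
    [0.000, 0.100, 0.400, 0.300, 0.000],
]


def pagerank(matrix, mu, tol=1e-10, max_iter=10_000):
    """Iterate v = mu * (v P) + (1 - mu) / n, where P is the row-stochastic matrix.

    tol and max_iter are illustrative choices, not values from the assignment.
    """
    n = len(matrix)
    # Row-normalize the similarity matrix (every row of M has a positive sum).
    P = [[x / sum(row) for x in row] for row in matrix]
    v = [1.0 / n] * n  # uniform starting vector
    for _ in range(max_iter):
        new_v = [
            mu * sum(v[i] * P[i][j] for i in range(n)) + (1.0 - mu) / n
            for j in range(n)
        ]
        # Stop once no component changes by more than tol.
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


scores = pagerank(M, mu=1.0)
print([round(s, 3) for s in scores])
```

With µ = 1.0 the result rounds to the values in option 1. Because M is symmetric, these values are also the row sums of M divided by the sum of all its entries, for example 1.1 / 3.84 ≈ 0.286 for the first sentence.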
Question 4: Find the PageRank values for the 5 sentence nodes from Question 2
(with similarity values as per matrix M in Equation 1). Use µ = 0.5.
[HINT: Use an online matrix multiplier or MATLAB/Python for solving this problem.]

1. 0.286 0.161 0.214 0.130 0.208
2. 0.251 0.173 0.201 0.163 0.212
3. 0.061 0.072 0.226 0.324 0.317
4. 0.310 0.031 0.207 0.348 0.104
Answer: 2
Solution: The initial value of v can be any probability vector, for example the
uniform one. Iterating the update below until v stops changing gives option 2.
v = [0.2, 0.2, 0.2, 0.2, 0.2]
µ = 0.5
# Repeat until convergence
v = µ(v M̃) + (1 − µ)/5
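As a cross-check on the iteration, the fixed point of the update can also be obtained in one step. At convergence v = µ(v M̃) + (1 − µ)/5, which is the linear system (I − µ M̃ᵀ) vᵀ = ((1 − µ)/5) 1. The sketch below solves it with a small Gaussian elimination in plain Python. The helper names `solve_linear` and `pagerank_closed_form` are hypothetical, and the approach assumes µ < 1 so that the system has a unique solution.

```python
# Similarity matrix M from Equation 1.
M = [
    [0.000, 0.500, 0.400, 0.200, 0.000],
    [0.500, 0.000, 0.020, 0.000, 0.100],
    [0.400, 0.020, 0.000, 0.000, 0.400],
    [0.200, 0.000, 0.000, 0.000, 0.300],
    [0.000, 0.100, 0.400, 0.300, 0.000],
]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    aug = [list(A[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        # Move the row with the largest pivot candidate into position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def pagerank_closed_form(matrix, mu):
    """Solve (I - mu * P^T) v = (1 - mu) / n directly; assumes mu < 1."""
    n = len(matrix)
    P = [[x / sum(row) for x in row] for row in matrix]
    # Entry (j, i) of I - mu * P^T is delta_ij - mu * P[i][j].
    A = [[(1.0 if i == j else 0.0) - mu * P[i][j] for i in range(n)]
         for j in range(n)]
    b = [(1.0 - mu) / n] * n
    return solve_linear(A, b)


scores = pagerank_closed_form(M, mu=0.5)
print([round(s, 3) for s in scores])
```

For µ = 0.5 the solution agrees with option 2 to three decimals. The direct solve is practical here only because the matrix is 5 × 5; for large graphs the iterative update is the usual choice.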
Question 5: Use Maximal Marginal Relevance to find the informativeness of the
first two sentences selected for the summary, based on the solution from Question 4.
Use µ = 0.5 (from the previous question), M from Equation 1 (from Question 2), and
λ = 1.
1. 0.251, 0.012
2. 0.251, 0.212
3. 0.286, 0.208
4. 0.324, 0.317
Answer: 2
Solution: Refer to Lecture 52
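The lecture's exact definition is not reproduced in this assignment, so the sketch below assumes the commonly used MMR form, score(t) = λ · Rel(t) − (1 − λ) · max over already selected s of Sim(t, s), with Rel taken as the rounded PageRank values from Question 4 and Sim taken as M. The function name `mmr_select` is hypothetical. Under this assumption, λ = 1 removes the redundancy term, so the two selected sentences are simply the two with the highest PageRank values.

```python
# Rounded PageRank values from Question 4 (mu = 0.5), used as relevance.
rel = [0.251, 0.173, 0.201, 0.163, 0.212]

# Similarity matrix M from Equation 1, used as the redundancy measure.
M = [
    [0.000, 0.500, 0.400, 0.200, 0.000],
    [0.500, 0.000, 0.020, 0.000, 0.100],
    [0.400, 0.020, 0.000, 0.000, 0.400],
    [0.200, 0.000, 0.000, 0.000, 0.300],
    [0.000, 0.100, 0.400, 0.300, 0.000],
]


def mmr_select(rel, sim, lam, k):
    """Greedy MMR selection (assumed standard form, not taken from the lecture)."""
    selected, scores = [], []
    candidates = list(range(len(rel)))
    while candidates and len(selected) < k:
        def mmr(i):
            # Redundancy is the highest similarity to any sentence chosen so far.
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * rel[i] - (1.0 - lam) * redundancy

        best = max(candidates, key=mmr)
        scores.append(mmr(best))
        selected.append(best)
        candidates.remove(best)
    return selected, scores


picked, scores = mmr_select(rel, M, lam=1.0, k=2)
print(picked, [round(s, 3) for s in scores])  # indices are 0-based
```

With λ = 1 this selects the first and fifth sentences, with scores 0.251 and 0.212, which corresponds to option 2.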
Question 6: What are the goals of using PageRank algorithm and Maximal Marginal
Relevance for summarization?
1. Ranking all possible summaries, Finding relevance of each sentence to others in
a document
2. Finding most informative sentences, Increasing diversity of selected sentences
3. Finding informative word nodes in a document graph, Increasing relevance of
the summary sentences to topic words
4. None of these are correct
Answer: 2
Solution:
Question 7: Consider the following test case for summarizing a document with 4
sentences, t1, t2, t3, t4. Their relevance scores are 3, 3, 2.8, and 2.8, respectively. The
redundancy scores between each pair of sentences are given by the matrix below.
 
\[
\mathrm{Red} = \begin{pmatrix}
0 & 0.5 & 0.5 & 0.5 \\
0.5 & 0 & 0.2 & 0.1 \\
0.5 & 0.2 & 0 & 0 \\
0.5 & 0.1 & 0 & 0
\end{pmatrix}
\]
If the length of each sentence is L and we want a summary of length 2L, find out the
optimal extractive summary.
1. t1, t3
2. t3, t4
3. t2, t4
4. t1, t2
Answer: 3
Solution: To find the best summary with two sentences, enumerate all the pairs and
compute the summary score of each pair with the formula below. The pair t2, t4 has
the highest summary score, 5.7.
\[
s(S) = \sum_{t_i \in S} \mathrm{Rel}(i) - \sum_{t_i, t_j \in S;\, i < j} \mathrm{Red}(i, j)
\]
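The enumeration described in the solution can be sketched in a few lines of Python. The function name `summary_score` and the 0-based indexing are choices made for this illustration; the relevance scores and the redundancy matrix are the ones given in the question.

```python
from itertools import combinations

# Relevance scores of t1..t4 and the pairwise redundancy matrix from the question.
rel = [3.0, 3.0, 2.8, 2.8]
red = [
    [0.0, 0.5, 0.5, 0.5],
    [0.5, 0.0, 0.2, 0.1],
    [0.5, 0.2, 0.0, 0.0],
    [0.5, 0.1, 0.0, 0.0],
]


def summary_score(subset):
    """s(S): total relevance minus redundancy over all pairs i < j in S."""
    relevance = sum(rel[i] for i in subset)
    redundancy = sum(red[i][j] for i, j in combinations(subset, 2))
    return relevance - redundancy


# A summary of length 2L holds exactly two sentences, so score every pair.
pairs = list(combinations(range(len(rel)), 2))
for pair in pairs:
    names = ", ".join(f"t{i + 1}" for i in pair)
    print(f"{names}: {summary_score(pair):.1f}")

best = max(pairs, key=summary_score)
print("best:", ", ".join(f"t{i + 1}" for i in best))
```

The six pair scores are 5.5, 5.3, 5.3, 5.6, 5.7 and 5.6, so the maximum is reached by t2, t4.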
Question 8: Consider the system-generated summary (S) and the reference summaries
as follows:
S : the cat was sleeping.
R1 : the cat was sleeping under the bed
R2 : the cat was found under the bed
R3 : the cat was under the bed, sleeping.
What are the ROUGE-1 and ROUGE-2 recall values for the given summary with
respect to the references?
1. 1.0, 1.0
2. 0.524, 0.389
3. 0.524, 0.611
4. 0.579, 0.366
Answer: 2
Solution:
\[
\text{ROUGE-1} = \frac{4 + 3 + 4}{7 + 7 + 7} = \frac{11}{21} \approx 0.524
\]
\[
\text{ROUGE-2} = \frac{3 + 2 + 2}{6 + 6 + 6} = \frac{7}{18} \approx 0.389
\]
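The counts in the solution can be reproduced with a short script. This is a sketch under stated assumptions: text is lowercased and split into alphabetic tokens (so punctuation is dropped), matches are clipped by the n-gram counts in S, and the matched and total counts are pooled over the three references as in the fractions above. The helper names `ngrams` and `rouge_n_recall` are hypothetical.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Count the n-grams of a text after lowercasing and dropping punctuation."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n):
    """Matched reference n-grams divided by total reference n-grams, pooled."""
    cand = ngrams(candidate, n)
    matched = total = 0
    for ref in references:
        ref_counts = ngrams(ref, n)
        # Clip each match by how often the n-gram occurs in the candidate.
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total


S = "the cat was sleeping."
refs = [
    "the cat was sleeping under the bed",
    "the cat was found under the bed",
    "the cat was under the bed, sleeping.",
]

rouge_1 = rouge_n_recall(S, refs, 1)
rouge_2 = rouge_n_recall(S, refs, 2)
print(round(rouge_1, 3), round(rouge_2, 3))
```

The script yields 11/21 for unigrams and 7/18 for bigrams, matching option 2.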
