
Natural Language Processing

Assignment 11
Type of Questions: MCQ

Number of Questions: 8 Total Marks: 8

Question 1: What are the ideal qualities of a summary in automatic extractive text
summarization?

1. Maximum relevance of summary to the theme of the document, Maximum
redundancy in information held by the summary.

2. Minimum relevance of summary to the theme of the document, Minimum
redundancy in information held by the summary.

3. Minimum relevance of summary to the theme of the document, Maximum
redundancy in information held by the summary.

4. Maximum relevance of summary to the theme of the document, Minimum
redundancy in information held by the summary.

Answer: 4
Solution: Maximum relevance to the theme is the basic goal of any summarization
method. Redundancy among the sentences of a summary should be minimal so that
the summary stays brief.

Question 2: Imagine you are summarizing a document of 5 sentences using the LEXRANK
algorithm. You have found the following similarity matrix $M$ between the
sentences, where $M[i, j]$ is the similarity between sentences $t_i$ and $t_j$.

$$
M = \begin{pmatrix}
0.000 & 0.500 & 0.400 & 0.200 & 0.000 \\
0.500 & 0.000 & 0.020 & 0.000 & 0.100 \\
0.400 & 0.020 & 0.000 & 0.000 & 0.400 \\
0.200 & 0.000 & 0.000 & 0.000 & 0.300 \\
0.000 & 0.100 & 0.400 & 0.300 & 0.000
\end{pmatrix} \tag{1}
$$

What is the row stochastic version of this matrix?

1.
$$
\tilde{M} = \begin{pmatrix}
0.000 & 0.500 & 0.400 & 0.200 & 0.000 \\
0.500 & 0.000 & 0.020 & 0.000 & 0.100 \\
0.400 & 0.020 & 0.000 & 0.000 & 0.400 \\
0.200 & 0.000 & 0.000 & 0.000 & 0.300 \\
0.000 & 0.100 & 0.400 & 0.300 & 0.000
\end{pmatrix}
$$

2.
$$
\tilde{M} = \begin{pmatrix}
0.000 & 0.455 & 0.364 & 0.182 & 0.000 \\
0.806 & 0.000 & 0.032 & 0.000 & 0.161 \\
0.488 & 0.024 & 0.000 & 0.000 & 0.488 \\
0.400 & 0.000 & 0.000 & 0.000 & 0.600 \\
0.000 & 0.125 & 0.500 & 0.375 & 0.000
\end{pmatrix}
$$

3.
$$
\tilde{M} = \begin{pmatrix}
0.000 & 0.806 & 0.488 & 0.400 & 0.000 \\
0.455 & 0.000 & 0.024 & 0.000 & 0.125 \\
0.364 & 0.032 & 0.000 & 0.000 & 0.500 \\
0.182 & 0.000 & 0.000 & 0.000 & 0.375 \\
0.000 & 0.161 & 0.488 & 0.600 & 0.000
\end{pmatrix}
$$

4. None of the above

Answer: 2
Solution: To make $M$ row-stochastic, divide each entry by the sum of the
elements in its row:

$$
\tilde{M}_{i,j} = \frac{M_{i,j}}{\sum_{k} M_{i,k}}
$$
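
The row normalization described in this solution can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the names `M`, `row_stochastic`, and `M_tilde` are illustrative and not part of the assignment.

```python
# Similarity matrix M from Equation 1.
M = [
    [0.000, 0.500, 0.400, 0.200, 0.000],
    [0.500, 0.000, 0.020, 0.000, 0.100],
    [0.400, 0.020, 0.000, 0.000, 0.400],
    [0.200, 0.000, 0.000, 0.000, 0.300],
    [0.000, 0.100, 0.400, 0.300, 0.000],
]


def row_stochastic(matrix):
    """Divide every row by its sum so that each row sums to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # M has no all-zero rows; the guard only avoids division by zero
        # if this helper is reused on another matrix.
        normalized.append([x / total if total else 0.0 for x in row])
    return normalized


M_tilde = row_stochastic(M)
for row in M_tilde:
    print(" ".join(f"{x:.3f}" for x in row))
```

Rounded to three decimals, the printed rows agree with option 2.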

Question 3: Find the PageRank values for the 5 sentence nodes from Question 2
(with similarity values as per the matrix in Equation 1). Use µ = 1.0.
[HINT: Use an online matrix multiplier or MATLAB/Python to solve this problem.]

1. 0.286 0.161 0.214 0.130 0.208

2. 0.251 0.173 0.201 0.163 0.212

3. 0.061 0.072 0.226 0.324 0.317

4. 0.310 0.031 0.207 0.348 0.104

Answer: 1
Solution: The initial value of v can be any probability vector; the uniform
vector is used here.

v = [0.2, 0.2, 0.2, 0.2, 0.2]
µ = 1.0
# Repeat until convergence
v = µ(v M̃) + (1 − µ)/5

Question 4: Find the PageRank values for the 5 sentence nodes from Question 2
(with similarity values as per the matrix in Equation 1). Use µ = 0.5.
[HINT: Use an online matrix multiplier or MATLAB/Python to solve this problem.]

1. 0.286 0.161 0.214 0.130 0.208

2. 0.251 0.173 0.201 0.163 0.212

3. 0.061 0.072 0.226 0.324 0.317

4. 0.310 0.031 0.207 0.348 0.104

Answer: 2
Solution: The initial value of v can be any probability vector; the uniform
vector is used here.

v = [0.2, 0.2, 0.2, 0.2, 0.2]
µ = 0.5
# Repeat until convergence
v = µ(v M̃) + (1 − µ)/5
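
The update rule used in Questions 3 and 4 can be run as a plain power iteration. The sketch below is one possible implementation under stated assumptions: it starts from the uniform vector, adds (1 − µ)/n to every component, and stops once successive iterates differ by less than a tolerance. The function names and the `tol` and `max_iter` parameters are illustrative choices, not part of the assignment.

```python
# Similarity matrix M from Equation 1.
M = [
    [0.000, 0.500, 0.400, 0.200, 0.000],
    [0.500, 0.000, 0.020, 0.000, 0.100],
    [0.400, 0.020, 0.000, 0.000, 0.400],
    [0.200, 0.000, 0.000, 0.000, 0.300],
    [0.000, 0.100, 0.400, 0.300, 0.000],
]


def row_stochastic(matrix):
    """Normalize each row by its sum (assumes no all-zero rows)."""
    return [[x / sum(row) for x in row] for row in matrix]


def pagerank(matrix, mu, tol=1e-12, max_iter=100_000):
    """Iterate v = mu * (v @ M_tilde) + (1 - mu) / n until v stops changing."""
    n = len(matrix)
    m_tilde = row_stochastic(matrix)
    v = [1.0 / n] * n  # uniform starting vector, as in the solution
    # max_iter is generous because the undamped case (mu = 1.0)
    # can need many more iterations than the damped one.
    for _ in range(max_iter):
        new_v = [
            mu * sum(v[i] * m_tilde[i][j] for i in range(n)) + (1.0 - mu) / n
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


for mu in (1.0, 0.5):
    print(mu, [round(x, 3) for x in pagerank(M, mu)])
```

With µ = 1.0 the result rounds to option 1, and with µ = 0.5 it rounds to option 2.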

Question 5: Use Maximal Marginal Relevance to find the informativeness of the
first two sentences for the summary constructed from the solution to Question 4.
Use µ = 0.5 (from the previous question), M from Equation 1 (from Question 2), and
λ = 1.

1. 0.251, 0.012

2. 0.251, 0.212

3. 0.286, 0.208

4. 0.324, 0.317

Answer: 2
Solution: Refer to Lecture 52
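
As a hedged illustration, the sketch below assumes the commonly used form of Maximal Marginal Relevance, score(t) = λ · Rel(t) − (1 − λ) · max Sim(t, s) over the already selected sentences s; the formulation in the lecture may differ. Relevance values are the rounded PageRank scores from option 2 of Question 4, similarities come from M in Equation 1, and the function name `mmr_select` is hypothetical.

```python
# Rounded PageRank scores for mu = 0.5 (option 2 of Question 4).
rel = [0.251, 0.173, 0.201, 0.163, 0.212]

# Similarity matrix M from Equation 1.
sim = [
    [0.000, 0.500, 0.400, 0.200, 0.000],
    [0.500, 0.000, 0.020, 0.000, 0.100],
    [0.400, 0.020, 0.000, 0.000, 0.400],
    [0.200, 0.000, 0.000, 0.000, 0.300],
    [0.000, 0.100, 0.400, 0.300, 0.000],
]


def mmr_select(rel, sim, lam, k):
    """Greedily pick k sentences by the assumed MMR score."""
    selected, scores = [], []
    candidates = list(range(len(rel)))
    while candidates and len(selected) < k:

        def mmr(i):
            # Redundancy is the largest similarity to any selected sentence;
            # it is 0 while nothing has been selected yet.
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * rel[i] - (1.0 - lam) * redundancy

        best = max(candidates, key=mmr)
        scores.append(mmr(best))
        selected.append(best)
        candidates.remove(best)
    return selected, scores


picked, values = mmr_select(rel, sim, lam=1.0, k=2)
print([f"t{i + 1}" for i in picked], values)
```

With λ = 1 the redundancy term vanishes, so the two picks are simply the sentences with the highest PageRank scores, t1 (0.251) and t5 (0.212), which is consistent with option 2.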

Question 6: What are the goals of using the PageRank algorithm and Maximal Marginal
Relevance, respectively, for summarization?

1. Ranking all possible summaries, Finding relevance of each sentence to others in
a document

2. Finding most informative sentences, Increasing diversity of selected sentences

3. Finding informative word nodes in a document graph, Increasing relevance of
the summary sentences to topic words

4. None of these are correct

Answer: 2
Solution:

Question 7: Consider the following test case for summarizing a document with 4
sentences, t1, t2, t3, t4. Their relevance scores are 3, 3, 2.8, and 2.8, respectively. The
redundancy scores between each pair of sentences are given by the matrix below.

$$
\mathrm{Red} = \begin{pmatrix}
0 & 0.5 & 0.5 & 0.5 \\
0.5 & 0 & 0.2 & 0.1 \\
0.5 & 0.2 & 0 & 0 \\
0.5 & 0.1 & 0 & 0
\end{pmatrix}
$$

If the length of each sentence is L and we want a summary of length 2L, find out the
optimal extractive summary.

1. t1, t3

2. t3, t4

3. t2, t4

4. t1, t2

Answer: 3
Solution: To find the best two-sentence summary, enumerate all pairs and compute
the summary score of each pair with the formula below. The pair t2, t4 has the
highest summary score: 3 + 2.8 − 0.1 = 5.7.

$$
s(S) = \sum_{t_i \in S} \mathrm{Rel}(i) - \sum_{t_i, t_j \in S;\; i < j} \mathrm{Red}(i, j)
$$
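
The enumeration over pairs can be sketched in Python as follows. It assumes, as the question states, that every sentence has length L, so a summary of length 2L contains exactly two sentences; the names `rel`, `red`, and `summary_score` are illustrative.

```python
from itertools import combinations

# Relevance scores of t1..t4 and the pairwise redundancy matrix Red.
rel = [3.0, 3.0, 2.8, 2.8]
red = [
    [0.0, 0.5, 0.5, 0.5],
    [0.5, 0.0, 0.2, 0.1],
    [0.5, 0.2, 0.0, 0.0],
    [0.5, 0.1, 0.0, 0.0],
]


def summary_score(sentences):
    """s(S): total relevance minus redundancy over pairs i < j in S."""
    relevance = sum(rel[i] for i in sentences)
    redundancy = sum(red[i][j] for i, j in combinations(sentences, 2))
    return relevance - redundancy


pairs = list(combinations(range(len(rel)), 2))
best = max(pairs, key=summary_score)
for pair in pairs:
    names = ", ".join(f"t{i + 1}" for i in pair)
    print(f"{names}: {summary_score(pair):.1f}")
print("best:", ", ".join(f"t{i + 1}" for i in best))
```

The six pair scores are 5.5, 5.3, 5.3, 5.6, 5.7, and 5.6, so t2, t4 is selected.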

Question 8: Consider the system-generated summary (S) and the reference summaries
as follows:
S : the cat was sleeping.
R1 : the cat was sleeping under the bed
R2 : the cat was found under the bed
R3 : the cat was under the bed, sleeping.
What are the ROUGE-1 and ROUGE-2 recall values for the given summary with
respect to the references?

1. 1.0, 1.0

2. 0.524, 0.389

3. 0.524, 0.611

4. 0.579, 0.366

Answer: 2
Solution:

$$
\text{ROUGE-1} = \frac{4 + 3 + 4}{7 + 7 + 7} = \frac{11}{21} \approx 0.524
$$

$$
\text{ROUGE-2} = \frac{3 + 2 + 2}{6 + 6 + 6} = \frac{7}{18} \approx 0.389
$$
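
These counts can be reproduced with a short script. The sketch below assumes a simple tokenization (lowercasing and keeping only alphabetic runs, so punctuation is dropped) and clips each reference n-gram match by its count in the system summary; the helper names `ngrams` and `rouge_n_recall` are illustrative and do not come from any ROUGE library.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Count the n-grams of a text after lowercasing and dropping punctuation."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n):
    """Matched reference n-grams divided by all reference n-grams."""
    cand = ngrams(candidate, n)
    matched = total = 0
    for ref in references:
        ref_counts = ngrams(ref, n)
        # Clip each match by how often the n-gram occurs in the candidate.
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total


S = "the cat was sleeping."
refs = [
    "the cat was sleeping under the bed",
    "the cat was found under the bed",
    "the cat was under the bed, sleeping.",
]
print(round(rouge_n_recall(S, refs, 1), 3))
print(round(rouge_n_recall(S, refs, 2), 3))
```

The two printed values are 0.524 and 0.389, matching option 2.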
