
Natural Language Processing

Assignment 4
Type of Questions: MCQ

Number of Questions: 8    Total Marks: 8 × 1 = 8

Question 1: What are the advantages of Maximum Entropy Markov Models (MEMMs)
over Hidden Markov Models (HMMs) for sequence tagging tasks?

1. Greater freedom in choosing features to represent observations in MEMMs than in HMMs

2. In MEMMs, domain knowledge can be incorporated through feature design

3. MEMMs make the very simplistic assumption that the tag at the current position
(y_t) in a sequence depends only on the previous tag (y_{t−1}) and the word at the
current position (x_t).

4. None of these

Answer: 1, 2
Solution: Statement 3 describes a property of HMMs, not MEMMs.

Question 2: HMMs and Maximum Entropy Markov Models are machine learning models
most suitable for ______.

1. Classifying images between dogs and cats.

2. Identifying POS tag for each word in a sentence.

3. Finding the most relevant document given a query.

4. All of the above

Answer: 2
Solution: HMMs and MEMMs are suitable for sequence classification or tagging tasks.

Question 3: You have been asked to design a 17-sided die with the maximum possible
entropy for the probability distribution over the sides obtained in a single throw,
subject to the additional constraint that P(1) = 0.5. What is the maximum possible
entropy you can achieve under these design constraints?

1. 10.0
2. 5.0
3. 3.0
4. 4.09

Answer: 3
Solution: Let S be the random variable for the side obtained in a single throw of the
die. It is given that P(S = 1) = 0.5, which implies

P(S = 2) + P(S = 3) + · · · + P(S = 17) = 1 − P(S = 1) = 1/2

The entropy of the distribution is maximized when this remaining probability mass is
distributed uniformly among s ∈ {2, 3, . . . , 17}:

P(S = 2) = P(S = 3) = · · · = P(S = 17) = 1/(2 × 16) = 1/32

The entropy is therefore

H(S) = −0.5 log₂(0.5) − 16 × (1/32) log₂(1/32) = 0.5 + 2.5 = 3
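
As a quick numerical check, the maximum-entropy distribution derived above can be
plugged into the entropy formula directly (a minimal Python sketch; the list of
probabilities is just the distribution from the solution):

```python
import math

# Maximum-entropy distribution under the constraint P(S = 1) = 0.5:
# the remaining 0.5 of probability mass is spread uniformly over sides 2..17.
probs = [0.5] + [0.5 / 16] * 16

# Shannon entropy in bits: H(S) = -sum_s P(s) * log2 P(s)
entropy = -sum(p * math.log2(p) for p in probs)
print(entropy)  # 3.0
```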

Question 4: Assuming that POS tagging follows a first-order Hidden Markov Model
with the following emission and transition matrices for its states, calculate
P(x1 = "ki", x2 = "fin", x3 = "yeni", y1 = "T2", y2 = "T3", y3 = "T3").
[Hint: all POS tags are equally likely at the start of a sequence.]

ki fin yeni
T1 0.1 0.1 0.8
T2 0.8 0.1 0.1
T3 0.2 0.2 0.6
T4 0.8 0.1 0.1

Table 1: Output symbol (emission) probabilities; rows are tags, columns are words

The transition matrix is:

T1 T2 T3 T4
T1 0.18 0.01 0.8 0.01
T2 0.9 0.0 0.05 0.05
T3 0.4 0.5 0.05 0.05
T4 0.4 0.5 0.05 0.05

Table 2: Hidden state transition matrix; rows are the previous tag, columns are the next tag

1. 0
2. 8 × 10⁻⁵
3. 2.4 × 10⁻⁴
4. 6 × 10⁻⁵

Answer: 4
Solution:

P(x1 = "ki", x2 = "fin", x3 = "yeni", y1 = "T2", y2 = "T3", y3 = "T3")
= P(y1 = T2) P(x1 = ki | y1 = T2)
  × P(y2 = T3 | y1 = T2) P(x2 = fin | y2 = T3)
  × P(y3 = T3 | y2 = T3) P(x3 = yeni | y3 = T3)
= (1/4) × 0.8 × 0.05 × 0.2 × 0.05 × 0.6
= 6 × 10⁻⁵
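
The same product can be reproduced with a few lines of Python (a minimal sketch;
the dictionaries simply encode Tables 1 and 2, and the 1/4 start probability comes
from the hint that all tags are equally likely at the start):

```python
# Emission probabilities (Table 1): P(word | tag).
emission = {
    "T1": {"ki": 0.1, "fin": 0.1, "yeni": 0.8},
    "T2": {"ki": 0.8, "fin": 0.1, "yeni": 0.1},
    "T3": {"ki": 0.2, "fin": 0.2, "yeni": 0.6},
    "T4": {"ki": 0.8, "fin": 0.1, "yeni": 0.1},
}
# Transition probabilities (Table 2): P(next tag | previous tag).
transition = {
    "T1": {"T1": 0.18, "T2": 0.01, "T3": 0.8, "T4": 0.01},
    "T2": {"T1": 0.9, "T2": 0.0, "T3": 0.05, "T4": 0.05},
    "T3": {"T1": 0.4, "T2": 0.5, "T3": 0.05, "T4": 0.05},
    "T4": {"T1": 0.4, "T2": 0.5, "T3": 0.05, "T4": 0.05},
}

words = ["ki", "fin", "yeni"]
tags = ["T2", "T3", "T3"]

# All four tags are equally likely at the start of the sequence.
prob = 0.25 * emission[tags[0]][words[0]]
for i in range(1, len(words)):
    prob *= transition[tags[i - 1]][tags[i]] * emission[tags[i]][words[i]]
print(prob)  # ≈ 6e-05
```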

Question 5: In the question above, what is the probability of the following transition?

P(y3 = T3 | y1 = T4, y2 = T1, x1 = ki, x2 = fin)

1. 0
2. 0.32
3. 0.8
4. 0.0064

Answer: 3
Solution: Since this is a Hidden Markov Model, the hidden state y3 depends only on y2:

P(y3 = T3 | y1 = T4, y2 = T1, x1 = ki, x2 = fin) = P(y3 = T3 | y2 = T1) = 0.8

Question 6: Consider the Markov Chain:

Figure 1: Example Markov Chain (at t = 0, P(START) = 1.0)

Which among the following sequences can NOT be generated by this Markov Chain?

1. START w1 w1 w2 w3 w3

2. START w1 w1 w1 w1 w1

3. START w1 w2 w3 w1

4. START w1 w1 w2 w3 w2 w1

Answer: 3, 4
Solution: A sequence can be generated only if every transition in it has non-zero probability in the chain; sequences 3 and 4 each contain a transition that is not present in Figure 1.

Question 7: Assign a POS tag sequence to the sentence "ki fin yeni" using a Maximum
Entropy Markov Model with beam search (beam size = 2). The features are given
below; all features have a weight of 1.0.

f1 = (t_i = T1 & t_{i−1} = T1 or T2)
f2 = (t_i = T2 & t_{i−1} = NULL)
f3 = (t_i = T3 or T4 & t_{i−1} = T2)
f4 = (w_i = yeni & t_i = T1 & t_{i−1} = T3)
f5 = (w_i = yeni & t_i = T4 & t_{i−1} = T4)
f6 = (w_i = fin & t_i = T4 & t_{i−1} = T2)

Data for Q7 and Q8: The POS tags allowed for the words used here are listed below:

ki : {T1, T2}
fin : {T2, T3, T4}
yeni : {T1, T4}

1. T2, T3, T1

2. T2, T2, T4

3. T1, T4, T4

4. T2, T4, T4

Answer: 4
Solution: At the first position (ki), only f2 fires (for T2), so P(T2) = e/(1 + e) ≈ 0.73 and P(T1) = 1/(1 + e) ≈ 0.27; the beam keeps both. At the second position (fin) with previous tag T2, f3 fires for T3 and T4 and f6 additionally fires for T4, giving P(T4) = e²/(1 + e + e²) ≈ 0.67, P(T3) ≈ 0.24 and P(T2) ≈ 0.09; with previous tag T1 no feature fires, so each of the three allowed tags gets 1/3. The two best partial sequences are (T2, T4) ≈ 0.49 and (T2, T3) ≈ 0.18. At the third position (yeni), f5 fires for T4 after T4 and f4 fires for T1 after T3, each giving e/(1 + e) ≈ 0.73 to that tag. The highest-scoring complete sequence is therefore (T2, T4, T4).

Question 8: In the previous question, what is the probability of the POS tag se-
quence found using beam-search?

1. 0.355

2. 0.486

3. 0.179

4. e/(1 + e)

Answer: 1
Solution: Following the beam search above, P(T2, T4, T4 | ki fin yeni) = (e/(1 + e)) × (e²/(1 + e + e²)) × (e/(1 + e)) ≈ 0.731 × 0.665 × 0.731 ≈ 0.355. A short script that reproduces these numbers is sketched below.
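
A sanity check for Questions 7 and 8 (a minimal Python sketch; the feature functions
below are hand-coded versions of f1 to f6, each with weight 1.0, and the local
distributions are normalized only over the tags allowed for each word):

```python
import math

# Tags allowed for each word (from the data for Q7 and Q8).
allowed = {"ki": ["T1", "T2"], "fin": ["T2", "T3", "T4"], "yeni": ["T1", "T4"]}

def score(word, tag, prev):
    """Sum of the weights of the features that fire (all weights are 1.0)."""
    s = 0.0
    if tag == "T1" and prev in ("T1", "T2"):
        s += 1.0  # f1
    if tag == "T2" and prev is None:
        s += 1.0  # f2
    if tag in ("T3", "T4") and prev == "T2":
        s += 1.0  # f3
    if word == "yeni" and tag == "T1" and prev == "T3":
        s += 1.0  # f4
    if word == "yeni" and tag == "T4" and prev == "T4":
        s += 1.0  # f5
    if word == "fin" and tag == "T4" and prev == "T2":
        s += 1.0  # f6
    return s

def local_probs(word, prev):
    """MEMM local distribution P(tag | previous tag, word) over the allowed tags."""
    weights = {t: math.exp(score(word, t, prev)) for t in allowed[word]}
    z = sum(weights.values())
    return {t: w / z for t, w in weights.items()}

def beam_search(words, beam_size=2):
    beam = [((), 1.0)]  # (partial tag sequence, probability)
    for word in words:
        candidates = []
        for seq, p in beam:
            prev = seq[-1] if seq else None
            for tag, q in local_probs(word, prev).items():
                candidates.append((seq + (tag,), p * q))
        beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beam

print(beam_search(["ki", "fin", "yeni"]))
# Best hypothesis: ('T2', 'T4', 'T4') with probability ≈ 0.355
```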
