
Mid Term Solution

MCQ
1. (B) P(B|A) decreases

By the product rule for a joint distribution,
P(A, B) = P(A|B)P(B) = P(B|A)P(A).
In P(A, B) = P(B|A)P(A), if P(A) increases, then only a decrease in P(B|A) can produce a decrease in P(A, B).
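As an illustration (the numbers here are chosen for this note, not taken from the exam): if P(A, B) = 0.2 is held fixed while P(A) increases from 0.4 to 0.8, then P(B|A) = P(A, B) / P(A) must fall from 0.5 to 0.25.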

2. (B) and (D)


On smaller datasets, variance is a concern, since even small changes in the training set may change the optimal parameters significantly. Hence, a high-bias / low-variance classifier would be preferred. On the other hand, with a large dataset we have enough points to represent the data distribution accurately, so variance is not of much concern, and one would go for the classifier with low bias even though it has higher variance.

3. (A) L1 norm.

4. (b) P(E), P(H), P(E|H)


This is Bayes' rule:
P(H|E, F) = P(E, F|H) · P(H) / P(E, F)

5. (d) decrease variance


Averaging the predictions of multiple classifiers substantially reduces the variance. Averaging is not specific to decision trees; it can work with many different learning algorithms, but it works particularly well with decision trees.
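As an illustration beyond the original solution, a minimal scikit-learn sketch of this idea (bagging); the dataset, tree settings, and ensemble size are arbitrary assumptions:

    # Sketch: compare a single decision tree with an average (bag) of 50 trees.
    # Dataset and hyperparameters are illustrative choices, not from the exam.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    single_tree = DecisionTreeClassifier(random_state=0)
    bagged_trees = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                                     random_state=0)

    # The ensemble's cross-validated accuracy is typically higher and more
    # stable across folds, reflecting the reduced variance.
    print(cross_val_score(single_tree, X, y, cv=5).mean())
    print(cross_val_score(bagged_trees, X, y, cv=5).mean())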

6. (b) lasso

For feature selection we would prefer lasso, since solving the lasso optimization problem drives some of the coefficients to be exactly zero (depending, of course, on the data), whereas ridge regression reduces the magnitude of the coefficients but does not push them all the way to zero.
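As a quick demonstration (not part of the original solution), a scikit-learn sketch on synthetic data with an arbitrary regularization strength alpha:

    # Sketch: lasso drives some coefficients exactly to zero; ridge only
    # shrinks them. Data and alpha below are illustrative assumptions.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso, Ridge

    X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                           noise=5.0, random_state=0)
    lasso = Lasso(alpha=1.0).fit(X, y)
    ridge = Ridge(alpha=1.0).fit(X, y)

    print(np.sum(lasso.coef_ == 0))  # typically several exact zeros
    print(np.sum(ridge.coef_ == 0))  # typically no exact zeros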

7. (A)

Take the derivative of the given function with respect to u, set it to zero, and solve for u; we want to minimize the squared error Σᵢ (xᵢ − u)²:

−2(x₁ − u) − 2(x₂ − u) − 2(x₃ − u) − … − 2(xₙ − u) = 0

u = (x₁ + x₂ + … + xₙ) / n = mean(x)
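A quick numeric check of this result (the sample values below are arbitrary):

    # Sketch: the sample mean minimizes the sum of squared errors.
    import numpy as np

    x = np.array([2.0, 5.0, 7.0, 11.0])  # arbitrary sample
    u = x.mean()
    sse = lambda c: np.sum((x - c) ** 2)

    print(sse(u))        # minimum
    print(sse(u + 0.1))  # strictly larger
    print(sse(u - 0.1))  # strictly larger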

8. (B)

λ is a scalar (the eigenvalue); it is the factor by which A scales the eigenvector v.

Given Av = λv, to check whether v is an eigenvector of A², consider A²v = A(Av) = A(λv) = λ(Av) = λ(λv) = λ²v. By the same argument, A⁵⁰v = λ⁵⁰v.
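A quick numeric check with NumPy (the matrix below is an arbitrary example, not from the exam):

    # Sketch: if A v = lam * v, then A^50 v = lam^50 * v.
    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 3.0]])
    vals, vecs = np.linalg.eig(A)
    lam, v = vals[0], vecs[:, 0]

    lhs = np.linalg.matrix_power(A, 50) @ v
    rhs = lam ** 50 * v
    print(np.allclose(lhs, rhs))  # True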

9. (A)

Counting the multiplications (and additions), the calculation is 64 × 200 + 200 × 10 = 14800.
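The same count in a couple of lines of Python (layer sizes taken from the numbers above):

    # Sketch: multiplications in consecutive layer-to-layer products.
    layers = [64, 200, 10]
    mults = sum(a * b for a, b in zip(layers, layers[1:]))
    print(mults)  # 14800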

10. (A)

(The explanation was given as an image in the original solution sheet and is not reproduced here.)

11. (C)

Take the eigenvectors whose eigenvalues are largest, since they capture more of the spread of the data. So v1 and v3 are the chosen vectors, with eigenvalues λ1 = 6 and λ4 = 6.
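As an illustration (the matrix below is an arbitrary assumption, not the exam's), selecting the top eigenvectors with NumPy:

    # Sketch: keep the eigenvectors with the largest eigenvalues.
    import numpy as np

    cov = np.array([[6.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 6.0]])   # arbitrary symmetric matrix
    vals, vecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]      # re-sort descending
    top2 = vecs[:, order[:2]]           # the two principal directions
    print(vals[order[:2]])              # [6. 6.]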

12. (D)

Expected gain: there is a 75% chance a number above 35 comes up (a win) and a 25% chance you lose, so each option's expected gain is (win chance × amount + loss chance × −amount) × number of bets:
A: (0.75 × 1 + 0.25 × −1) × 100 = 50
B: (0.75 × 10 + 0.25 × −10) × 10 = 50
C: (0.75 × 100 + 0.25 × −100) × 1 = 50
All have the same expected gain, so the answer is D.
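A one-loop check of the arithmetic (using the 0.75/0.25 win/loss probabilities as corrected above):

    # Sketch: every option has the same expected gain.
    for amount, bets in [(1, 100), (10, 10), (100, 1)]:
        gain = (0.75 * amount + 0.25 * -amount) * bets
        print(amount, bets, gain)  # 50.0 for each option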

13. (D)

Friends 1 and 2 select the same number with probability ¼ and different numbers with probability ¾. If they selected the same number, friend 3 matches it with probability ¼, so the probability that all three match is ¼ × ¼ = 1/16. If they selected different numbers, friend 3 matches one of them with probability 2/4. So the total probability of a "Yes" (a shared number) is 1/16 + (¾ × 2/4) = 7/16.
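A Monte Carlo check of the event the solution computes (friend 3's number matching friend 1's or friend 2's), assuming each friend picks uniformly from 4 numbers:

    # Sketch: estimate the probability by simulation; expect ~7/16 = 0.4375.
    import numpy as np

    rng = np.random.default_rng(0)
    f1, f2, f3 = rng.integers(0, 4, size=(3, 100_000))
    print(np.mean((f3 == f1) | (f3 == f2)))  # ~0.4375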

14. (C)

15. (D)

All of the listed conditions are part of the random forest training procedure.


Mid Term (Subjective Solution)

1.

The formula for entropy is H = −Σᵢ pᵢ log(pᵢ).

Answer: −(5/8 log(5/8) + 3/8 log(3/8)) = 0.287

(The value 0.287 corresponds to base-10 logarithms; with base-2 logarithms the entropy is 0.954 bits.)
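A quick check of the arithmetic, confirming the base-10 convention:

    # Sketch: entropy of the (5/8, 3/8) split in different log bases.
    import numpy as np

    p = np.array([5/8, 3/8])
    print(-np.sum(p * np.log10(p)))  # ~0.287 (base 10, as stated above)
    print(-np.sum(p * np.log2(p)))   # ~0.954 bits (base 2, for comparison)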

2. 0.63

P(X|T) = P(T|X) · P(X) / P(T)
= (0.99 × 0.05) / (0.99 × 0.05 + 0.03 × 0.95)
= 0.0495 / 0.078
= 0.63
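The same computation in Python for verification:

    # Sketch: Bayes' rule with the numbers given above.
    p_t_given_x = 0.99              # P(T|X)
    p_x = 0.05                      # P(X)
    p_t = p_t_given_x * p_x + 0.03 * (1 - p_x)  # P(T) by total probability
    print(p_t_given_x * p_x / p_t)  # ~0.6346, rounds to 0.63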

3. Give 1 mark each if the final answer matches, even when the formula is missing; if the final answer is wrong but the formula is written correctly, give 0.5 marks each.

Accuracy: (TP + TN) / (TP + FP + TN + FN) = 120/150 = 4/5 = 0.8
Misclassification Rate: 1 − Accuracy = 0.2
True Positive Rate: TP / (TP + FN) = 70/80 = 7/8 = 0.875
False Positive Rate: FP / (FP + TN) = 20/70 = 2/7 ≈ 0.286
Precision: TP / (TP + FP) = 70/90 = 7/9 ≈ 0.778
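For verification, the same metrics computed from the confusion-matrix counts implied by the fractions above (TP = 70, FN = 10, FP = 20, TN = 50):

    # Sketch: classification metrics from the implied counts.
    tp, fn, fp, tn = 70, 10, 20, 50

    accuracy = (tp + tn) / (tp + fp + tn + fn)
    print(accuracy)        # 0.8
    print(1 - accuracy)    # 0.2 (misclassification rate)
    print(tp / (tp + fn))  # 0.875 (true positive rate)
    print(fp / (fp + tn))  # ~0.286 (false positive rate)
    print(tp / (tp + fp))  # ~0.778 (precision)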

4.

Bias measures how far the predicted values are from the actual values. Low bias indicates a model whose predictions are very close to the actual ones. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).

Variance refers to how much the learned model changes when trained with different training data. For a good model, the variance should be minimized. High variance can cause an algorithm to model the random noise in the training data rather than the intended outputs (overfitting).
