MBA ZG536 EC-3R First Sem 2023-2024
Comprehensive Examination
(EC-3 Regular)
Q1. Set A
import pandas as pd
…
features = ['AGE', 'COUNTRY', 'PLAYING ROLE']
With reference to the above piece of code, what kind of encoding is carried out by the
function “get_dummies”? Explain this type of encoding in detail and why it is
required. (5 marks)
Q1. Set B
import pandas as pd
…
features = ['AGE', 'COUNTRY', 'PLAYING ROLE']
What is the significance of the “drop_first = True” parameter in the above code, and why is it
required? Explain in detail. (5 marks)
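The code for this question is elided in the paper, so the following is only a minimal sketch of the kind of call the question refers to. The DataFrame `players` and its rows are hypothetical; only the `features` list and the `drop_first` parameter come from the question.

```python
import pandas as pd

# Hypothetical toy data: the column names mirror the `features` list,
# but every row is invented purely for illustration.
players = pd.DataFrame({
    "AGE": [1, 2, 1, 3],
    "COUNTRY": ["IND", "AUS", "IND", "SA"],
    "PLAYING ROLE": ["Batsman", "Bowler", "Allrounder", "Batsman"],
})
features = ["AGE", "COUNTRY", "PLAYING ROLE"]

# Default behaviour: one indicator column per category level of each feature.
full = pd.get_dummies(players[features], columns=features)

# drop_first=True omits the first level of each feature, so a feature with
# k levels is represented by k - 1 indicator columns.
reduced = pd.get_dummies(players[features], columns=features, drop_first=True)

print(full.columns.tolist())
print(reduced.columns.tolist())
```

Comparing the two printed column lists shows which level of each feature is dropped.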
Q1. Set C
import pandas as pd
…
features = ['AGE', 'COUNTRY', 'PLAYING ROLE']
Q2. Set A
How are unknown datapoints classified using the K Nearest Neighbour (KNN) algorithm? What
is the rule of thumb for fixing the value of K? (5 Marks)
Q2. Set B
What technique does the K Nearest Neighbour (KNN) algorithm use to classify unknown
datapoints? What is the rule of thumb for fixing the value of K? (5 Marks)
Q2. Set C
Why is the KNN algorithm a type of lazy learning? Explain. What does “distance” refer to in
the KNN algorithm, and which metrics can be used? (5 marks)
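For reference, the idea behind the Q2 questions can be sketched in a few lines of plain Python. This is an illustrative sketch rather than a model answer: the sample points, labels and function names are hypothetical, and Euclidean and Manhattan distance are shown only as two commonly used metrics.

```python
import math
from collections import Counter

def euclidean(a, b):
    # Straight-line distance between two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_points, train_labels, query, k=3, distance=euclidean):
    # No model is fitted in advance: the training data is simply stored and
    # all distance computations happen when a query arrives.
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: distance(pair[0], query))
    # Majority vote among the k nearest neighbours.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2D training data with two classes.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 6.0)]
labels = ["red", "red", "red", "blue", "blue", "blue"]

prediction = knn_predict(points, labels, query=(2.0, 2.0), k=3)
print(prediction)
```

Passing `distance=manhattan` swaps the metric without changing the voting logic.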
Q3. Set A
Why is the K-Means algorithm an unsupervised learning algorithm? Explain how the value of K
is determined. (5 Marks)
Q3. Set B
What is the problem when we use accuracy as a metric to evaluate a model built over a highly
imbalanced dataset? Explain. (5 Marks)
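The situation this question describes can be reproduced with a small, hypothetical example. The 95/5 class split and the always-negative "model" below are assumptions chosen for illustration.

```python
# Hypothetical labels: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A trivial "model" that always predicts the majority class.
y_pred = [0] * 100

# Accuracy: fraction of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall: fraction of actual positives that were predicted positive.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(t == 1 for t in y_true)

print(accuracy, recall)
```

Here accuracy looks high even though not a single positive case is detected.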
Q3. Set C
What is hyperparameter tuning in Machine Learning? Explain how hyperparameter tuning is
carried out using Python. (5 marks)
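One common way to do this in Python is a cross-validated grid search with scikit-learn. The sketch below is an illustration only: the choice of the Iris dataset, the KNN classifier and the parameter grid are assumptions and are not taken from the paper.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Assumed example dataset; any labelled dataset would do.
X, y = load_iris(return_X_y=True)

# Candidate hyperparameter values to try (illustrative choices).
param_grid = {
    "n_neighbors": [1, 3, 5, 7, 9],
    "weights": ["uniform", "distance"],
}

# Every combination in the grid is scored with 5-fold cross-validation.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(search.best_params_, search.best_score_)
```

`search.best_params_` holds the combination with the highest mean cross-validated score, and `search.best_estimator_` is a model refitted on the full data with those values.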
Q4. Set A.
Answer the following questions with respect to the following figure. (10 Marks)
Q4. Set B.
Answer the following questions with respect to the following figure. (10 Marks)
a) Suppose the above figure shows decision boundaries for a KNN model and a Logistic
Regression model applied to a 2D dataset. State which decision boundary (A or
B) belongs to which algorithm (KNN or Logistic Regression). Explain why. [4 Marks]
c) What would be the equation of the decision boundary for the ML algorithm referred
to in question b)? Explain. [2 Marks]
d) With reference to question b), explain why this function is used to fit the data rather
than a straight line/hyperplane. [3 Marks]
Q4. Set C.
Answer the following questions with respect to the following figure. (10 Marks)
a) What function is denoted by the following figure? [1 Mark]
b) In which Machine Learning algorithm is it used? [1 Mark]
c) Explain why this function is used to fit the data rather than a straight line/hyperplane.
[3 Marks]
d) Explain the concept of a threshold and describe a way of determining an optimal
threshold. [5 Marks]
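The figure for this question is not reproduced here. Assuming it shows the logistic (sigmoid) function, the notion of a classification threshold can be sketched as follows. The labels, scores and candidate thresholds are hypothetical, and Youden's J statistic (true positive rate minus false positive rate) is used as just one possible criterion for choosing a threshold.

```python
import math

def sigmoid(z):
    # Maps any real number to the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def classify(z, threshold=0.5):
    # Predict class 1 when the sigmoid output reaches the threshold.
    return 1 if sigmoid(z) >= threshold else 0

def youden_best_threshold(y_true, scores, candidates):
    # Pick the candidate threshold that maximises TPR - FPR.
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best_t, best_j = None, -1.0
    for t in candidates:
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical true labels and predicted probabilities.
y_true = [0, 0, 0, 1, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.45, 0.5, 0.55, 0.8, 0.9]

best_t, best_j = youden_best_threshold(y_true, scores, [0.25, 0.4, 0.6, 0.75])
print(best_t, best_j)
```

The default of 0.5 in `classify` is a convention, not a requirement; the search above shows that a different cut-off can separate these particular scores better.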
Q5. Set A.
Explain what entropy is with respect to a decision tree. Which of the following states has the
highest impurity? Calculate the entropy of each state in the figure. Note that the colours
represent the classes of the data points in the figure. (10 Marks)
Q5. Set B
Explain what entropy is with respect to a decision tree. Which of the following states has the
highest impurity? Calculate the entropy of each state in the figure. Note that the colours
represent the classes of the data points in the figure. (10 Marks)
Q5. Set C
Explain what entropy is with respect to a decision tree. Which of the following states has the
highest impurity? Calculate the entropy of each state in the figure. Note that the colours
represent the classes of the data points in the figure. (10 Marks)
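The figures for Q5 are not reproduced here, so the class counts below are hypothetical. The sketch only illustrates how the entropy of a node, H = -sum(p_i * log2(p_i)) over the class proportions p_i, could be computed from the colours of its data points.

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy (in bits) of the class distribution in one node.
    total = len(labels)
    result = 0.0
    for count in Counter(labels).values():
        p = count / total
        result -= p * math.log2(p)
    return result

# Hypothetical states; the real counts must be read from the figure.
pure = ["red"] * 8
balanced = ["red"] * 4 + ["blue"] * 4
skewed = ["red"] * 6 + ["blue"] * 2

for name, state in [("pure", pure), ("balanced", balanced), ("skewed", skewed)]:
    print(name, round(entropy(state), 3))
```

For two classes, entropy ranges from 0 for a node containing a single class to 1 for an evenly mixed node.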
Q6. Set A.
Give examples of use cases where each of the following metrics should be preferred over the
other for model evaluation. (10 Marks)
1. Precision
2. Recall
Q6. Set B.
Describe the steps of the k-means clustering algorithm. How is the value of k determined? (10
Marks)
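As a rough illustration of the algorithm this question asks about, the sketch below alternates an assignment step and an update step until the centroids stop moving. The data points are hypothetical, and the initial centroids are passed in explicitly to keep the example deterministic, whereas in practice they are usually chosen at random. The within-cluster sum of squares (`inertia`) is the quantity typically plotted against k in the elbow method.

```python
def squared_distance(p, c):
    return sum((a - b) ** 2 for a, b in zip(p, c))

def kmeans(points, initial_centroids, max_iter=100):
    centroids = [tuple(c) for c in initial_centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [squared_distance(p, c) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, clusters

def inertia(centroids, clusters):
    # Within-cluster sum of squared distances.
    return sum(squared_distance(p, c)
               for c, cluster in zip(centroids, clusters) for p in cluster)

# Hypothetical 2D data with two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]

centroids_1, clusters_1 = kmeans(points, [(1, 1)])
centroids_2, clusters_2 = kmeans(points, [(1, 1), (8, 8)])
print(inertia(centroids_1, clusters_1), inertia(centroids_2, clusters_2))
```

Running this for increasing k and looking for the point where the drop in inertia levels off is one way to choose k.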
Q6. Set C.
Give examples of use cases where each of the following metrics should be preferred over the
other for model evaluation. (10 Marks)
1. Precision
2. Recall