B9DA104 - 1920 - TMD3 - First Sitting Exam Paper


B9DA104

QQI
MSC Data Analytics

AUTUMN 2020 EXAMINATIONS

Module Code: B9DA104

Module Description: Machine learning

Examiner: Abhishek Kaushik

Internal Moderator: Terry Hoare

External Examiner: Andrew Parnell

Date: Thursday, 20th August 2020


Time: 09:30-11:30

INSTRUCTIONS TO CANDIDATES
Time allowed is 2 hours
Question 1 is compulsory
Answer any 2 other Questions.


Question 1
Define each of the following (minimum 70 words each), with an example.
a) KNN
b) Clustering
c) Types of Supervised Learning
d) Model selection
e) F1 score
(10*5=50 Marks)

Question 2

a) Write the pseudocode or Python code for applying a multilayer perceptron
(MLP) to multi-class classification for sentiment analysis.
b) Explain the trade-off between bias and variance with examples, and how it
is affected by data imbalance.
c) Draw the diagram of the human neuron and label its parts.
(10*2+5=25 Marks)

Question 3
a) What are the six steps of the machine learning cycle? Explain with the help of
an example (minimum 200 words).

b) Assume we have a set of data from patients who visited a hospital during the
year 2020. A set of features (e.g., temperature, height) has also been extracted
for each patient. Our goal is to decide whether a new visiting patient has any of
diabetes, heart disease, or Alzheimer's disease (a patient can have one or more
of these diseases).

I. We have decided to use a neural network to solve this problem. We have two
choices: either to train a separate neural network for each of the diseases or
to train a single neural network with one output neuron for each disease, but
with a shared hidden layer. Which method do you prefer? Justify your answer.

II. Some patient features are expensive to collect (e.g., brain scans) whereas
others are not (e.g., temperature). Therefore, we have decided to first ask our
classification algorithm to predict whether a patient has a disease, and if the
classifier is 80% confident that the patient has a disease, then we will do
additional examinations to collect additional patient features. In this case,
which classification method do you recommend: neural networks, decision
tree, or naive Bayes? Justify your answer in one or two sentences.

c) Suppose we clustered a set of N data points using two different clustering
algorithms: k-means and Gaussian mixtures. In both cases we obtained 5 clusters,
and, in both cases, the centres of the clusters are exactly the same. Can 3 points
that are assigned to different clusters in the k-means solution be assigned to the
same cluster in the Gaussian mixture solution? If no, explain. If so, sketch an
example or explain in 2-3 sentences.


d) Explain the principle of the gradient descent algorithm. Accompany your
explanation with a diagram. Explain the use of all the terms and constants that
you introduce and comment on the range of values that they can take.

(5*5=25 Marks)

Question 4
a) Write a short note on each of the following (minimum 100 words each):
1. Automated feature selection
2. Exploratory data analysis

b) For each scenario below, describe an example of a classification problem with
two real-valued inputs and one discrete output value.

1. A Gaussian Bayes Classifier (GBC) does well on the training data but badly
on the test data.
2. GBC does well on the test data, but a Decision Tree (DT) does badly on the
test data.
3. GBC does badly on more than two-thirds of the test data, but DT does
almost perfectly on the test data.

(5*5=25 Marks)

END OF EXAMINATION
