Sample Question Paper

Q 1) Customers of an insurance company have purchased travel insurance to stay protected while
travelling the world. The dataset below lists the purchase transactions made in 2015. List
at least 6 issues in the dataset that must be addressed during data pre-processing. [3M]

Name   | Age | Date of Purchase | Profession | Date of Birth | Policy Category | Sum Insured
Ram    | 34  | 15-Jan-2015      | Lawyer     | Feb 24, 1981  | SUPER           | ₹ 1200
Laxman | 33  | 27-Jan-2015      |            | Mar 27, 1982  | ECO             | $ 89
Rama   | 34  | 15-Jan-2015      | Lawyer     | Feb 24, 1981  | SUPER           | ₹ 1200
Mahesh | 32  | 30-Jan-2015      | Engineer   | Nov 25,1982   | ECO             | ₹ 4877
Sita   | n/a | 22-Jan-2015      | Designer   | May 22,1990   | SUPER           | ₹0

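[ANS]:
i) The 1st and 3rd rows are duplicates (the name differs only in spelling: Ram vs. Rama); one of them should be removed.
ii) The ‘n/a’ age should be replaced with the mean age, and the blank profession with the mode.
iii) Age and Date of Birth are redundant columns, since age can be derived from the date of birth.
iv) Sum Insured amounts are in different currencies (₹ and $) and should be converted to a single currency.
v) Date of Purchase and Date of Birth use different date formats and should be standardized.
vi) The ₹0 Sum Insured should be replaced with the mean value.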
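
The clean-up steps listed in this answer can be sketched in plain Python. This is a minimal illustration, not a reference solution: the exchange rate `USD_TO_INR`, the field names, and the choice to ignore the name spelling when detecting duplicates are all assumptions made for the example. Both Age and Date of Birth are kept here; dropping one of them is left to the reader.

```python
from collections import Counter
from datetime import date, datetime
from statistics import mean

USD_TO_INR = 65.0  # hypothetical exchange rate, for illustration only

# (name, age, purchase date, profession, date of birth, category, currency, sum insured)
ROWS = [
    ("Ram", 34, "15-Jan-2015", "Lawyer", "Feb 24, 1981", "SUPER", "INR", 1200),
    ("Laxman", 33, "27-Jan-2015", None, "Mar 27, 1982", "ECO", "USD", 89),
    ("Rama", 34, "15-Jan-2015", "Lawyer", "Feb 24, 1981", "SUPER", "INR", 1200),
    ("Mahesh", 32, "30-Jan-2015", "Engineer", "Nov 25,1982", "ECO", "INR", 4877),
    ("Sita", None, "22-Jan-2015", "Designer", "May 22,1990", "SUPER", "INR", 0),
]


def parse_birth(text):
    # Accept both "Feb 24, 1981" and "Nov 25,1982" by dropping the optional space.
    return datetime.strptime(text.replace(", ", ","), "%b %d,%Y").date()


def clean(rows):
    seen, out = set(), []
    for name, age, bought, prof, born, cat, cur, amount in rows:
        # Assumption: two rows are duplicates if everything except the name matches.
        key = (age, bought, prof, born, cat, cur, amount)
        if key in seen:
            continue  # issue i: drop the duplicate transaction
        seen.add(key)
        out.append({
            "name": name,
            "age": age,
            "purchased": datetime.strptime(bought, "%d-%b-%Y").date(),  # issue v
            "profession": prof,
            "born": parse_birth(born),  # issue v
            "category": cat,
            # issue iv: express every amount in one currency
            "sum_inr": amount * USD_TO_INR if cur == "USD" else float(amount),
        })

    # Imputation values are computed from the known entries only.
    mean_age = round(mean(r["age"] for r in out if r["age"] is not None))
    # With a tie (as here), most_common picks the first value encountered.
    top_profession = Counter(
        r["profession"] for r in out if r["profession"]
    ).most_common(1)[0][0]
    mean_sum = mean(r["sum_inr"] for r in out if r["sum_inr"] > 0)

    for r in out:
        if r["age"] is None:
            r["age"] = mean_age  # issue ii
        if not r["profession"]:
            r["profession"] = top_profession  # issue ii
        if r["sum_inr"] == 0:
            r["sum_inr"] = mean_sum  # issue vi
    return out


cleaned = clean(ROWS)
```

On a table this small the mode of Profession is a three-way tie after de-duplication, so the imputed value is essentially arbitrary; the sketch only shows the mechanics.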

Q 3) Compare the two models built on the same dataset. Which model is not preferred?
Justify your answer briefly with a bias and variance interpretation of the models
below.
[6M]

[Figure: two learning-curve plots, “Learning Curve - Linear Regression Model A” and “Learning Curve - Linear Regression Model B”. Each plots RMSE (0 to 2) against Training Set Size (0 to 1000); the curves themselves are not recoverable.]

Marking Scheme:
Bias Interpretation – 1.5m
Variance Interpretation – 1.5m
Correct answer – 1.5m
Under/Over fit Identification – 1.5m
[ANS]:
Model A is NOT preferred.
Model A: high bias | low variance | underfit
Model B: average or relatively low bias | average or relatively high variance (but only by a small
factor) | neither underfit nor overfit | better model

Q 4) Use the Naïve Bayes classification technique to predict the crash severity for the new
instance: <Weather condition = “Good”, Driver’s condition = “Sober”, Traffic Violation =
“Exceed Speed Limit”, Seat Belt = “No”>.
[6M]
Weather condition | Driver’s condition | Traffic Violation       | Seat Belt | Crash Severity
Good              | Alcohol-impaired   | Exceed Speed Limit      | No        | Major
Bad               | Sober              | None                    | Yes       | Minor
Good              | Sober              | Disobeys traffic signal | Yes       | Minor
Good              | Sober              | Exceed Speed Limit      | Yes       | Major
Bad               | Sober              | Disobeys traffic signal | No        | Major
Good              | Alcohol-impaired   | Disobeys traffic signal | Yes       | Minor
Bad               | Alcohol-impaired   | None                    | Yes       | Major
Good              | Sober              | Disobeys traffic signal | Yes       | Major

Marking Scheme
Laplace for Traffic Violation & propagation =1.5m
Laplace for Seat Belt & propagation =1.5m
Correct Value for P(Major | X) = 1m
Correct Value for P(Minor | X) = 1m
Correct Inference & answer = 1m
[ANS]
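X = <Weather condition = “Good”, Driver’s condition = “Sober”, Traffic Violation =
“Exceed Speed Limit”, Seat Belt = “No”>

Laplace (add-one) smoothing is applied to Traffic Violation (3 possible values) and Seat Belt (2 possible values).

P(Major | X) ∝ P(Major) * P(Good | Major) * P(Sober | Major) * P(Exceed Speed Limit | Major) * P(No | Major)
= 5/8 * 3/5 * 3/5 * (2+1)/(5+3) * (2+1)/(5+2)
= 5/8 * 3/5 * 3/5 * 3/8 * 3/7 = 405/11200 ≈ 0.0362
P(Minor | X) ∝ 3/8 * 2/3 * 2/3 * (0+1)/(3+3) * (0+1)/(3+2)
= 3/8 * 2/3 * 2/3 * 1/6 * 1/5 = 12/2160 ≈ 0.0056
P(Major | X) > P(Minor | X)
Ans: Crash severity = MAJOR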
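
The arithmetic in this answer can be checked with a short script. It is a sketch that follows the marking scheme, assuming add-one smoothing only on Traffic Violation and Seat Belt; the names `DATA`, `SMOOTHED`, and `score` are illustrative. Exact fractions are used so the results can be compared directly with 405/11200 and 12/2160.

```python
from fractions import Fraction

# Training rows from the question:
# (weather, driver's condition, traffic violation, seat belt, crash severity)
DATA = [
    ("Good", "Alcohol-impaired", "Exceed Speed Limit", "No", "Major"),
    ("Bad", "Sober", "None", "Yes", "Minor"),
    ("Good", "Sober", "Disobeys traffic signal", "Yes", "Minor"),
    ("Good", "Sober", "Exceed Speed Limit", "Yes", "Major"),
    ("Bad", "Sober", "Disobeys traffic signal", "No", "Major"),
    ("Good", "Alcohol-impaired", "Disobeys traffic signal", "Yes", "Minor"),
    ("Bad", "Alcohol-impaired", "None", "Yes", "Major"),
    ("Good", "Sober", "Disobeys traffic signal", "Yes", "Major"),
]

# Columns that receive Laplace smoothing, mapped to their number of distinct
# values (Traffic Violation: 3, Seat Belt: 2), as in the marking scheme.
SMOOTHED = {2: 3, 3: 2}


def score(x, label):
    """Unnormalised Naive Bayes score: prior times per-attribute likelihoods."""
    rows = [r for r in DATA if r[-1] == label]
    result = Fraction(len(rows), len(DATA))  # class prior
    for i, value in enumerate(x):
        count = sum(1 for r in rows if r[i] == value)
        if i in SMOOTHED:
            result *= Fraction(count + 1, len(rows) + SMOOTHED[i])
        else:
            result *= Fraction(count, len(rows))
    return result


x = ("Good", "Sober", "Exceed Speed Limit", "No")
major = score(x, "Major")
minor = score(x, "Minor")
posterior_major = major / (major + minor)  # normalised over the two classes
prediction = "Major" if major > minor else "Minor"
```

The two scores are proportional to the posteriors rather than equal to them; dividing by their sum gives the normalised probability of Major, 729/841 (about 0.87), which leads to the same decision.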

Q 5) Consider the hypothesis function h(x) = w0 + w1*x1 + w2*x2 + w3*x1^2 + w4*x2^2, where <x1, x2>
are the input data features and w = <w0, w1, .., w4> = <-36, 0, 0, 4, 9>. The output of the
logistic regression is given by: y = 1 / (1 + exp(-h(x))).
Answer the questions below: [2+2+2=6M]

a) What is the equation of the decision boundary g(x1, x2)? Note: y = 0.5 on decision
boundary.

[ANS]
y = 0.5 exactly when h(x) = 0, i.e. -36 + 4x1^2 + 9x2^2 = 0, so the decision boundary is
4x1^2+9x2^2=36

b) What will be the shape of the decision surface in (x1, x2) plane? Draw the decision surface
in (x1,x2) plane.

[ANS] An ellipse centered at (0,0): x1^2/9 + x2^2/4 = 1, with semi-axes 3 along x1 and 2 along x2.

c) What will be value of y for (x1,x2)=(3,3)?

[ANS]
h(3,3) = -36 + 4*3^2 + 9*3^2 = 81, so y = 1 / (1 + exp(-81)) ≈ 1
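
The three answers to this question can be verified numerically. The sketch below assumes the hypothesis has the quadratic form h(x) = w0 + w1*x1 + w2*x2 + w3*x1^2 + w4*x2^2, which is the form consistent with the five weights and the boundary 4x1^2+9x2^2=36; the function names `h` and `y` are illustrative.

```python
import math

W = (-36, 0, 0, 4, 9)  # w0..w4 from the question


def h(x1, x2):
    # Assumed form: w0 + w1*x1 + w2*x2 + w3*x1^2 + w4*x2^2
    w0, w1, w2, w3, w4 = W
    return w0 + w1 * x1 + w2 * x2 + w3 * x1 ** 2 + w4 * x2 ** 2


def y(x1, x2):
    # Logistic output: 1 / (1 + exp(-h(x)))
    return 1.0 / (1.0 + math.exp(-h(x1, x2)))


on_boundary = [y(3, 0), y(-3, 0), y(0, 2), y(0, -2)]  # ellipse vertices
inside = y(0, 0)    # h = -36, so y is close to 0
outside = y(3, 3)   # h = 81, so y is close to 1
```

Points on the ellipse give y = 0.5, points inside it give y below 0.5, and points outside give y above 0.5, so (3,3) is classified as positive with near certainty.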

Q 6) Design a maximum margin SVM classifier for the following dataset. [2+1+2+1 = 6M]
(a) Suggest a suitable feature transformation that allows for linear classification.
[ANS] Choose z = (x1-1)^2 + x2^2

(b) Draw the transformed data points and identify the support vectors.
[ANS] The support vectors are the points at z=1 and z=2.

(c) What is the equation of the maximum margin classifier in transformed space?
[ANS] z=1.5

(d) What is the equation of the maximum margin classifier in the original (x1, x2) space?
[ANS] (x1-1)^2+x2^2=1.5
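
The transformation and the resulting boundary can be sketched as follows. The dataset itself is not reproduced in this copy of the paper, so the sample points are hypothetical, chosen only so that they map to the support-vector values z=1 and z=2; the names `z` and `side` are illustrative, and which class lies inside the circle is not assumed.

```python
def z(x1, x2):
    # Feature transformation from part (a): squared distance from (1, 0).
    return (x1 - 1) ** 2 + x2 ** 2


# Support-vector values from part (b); the maximum margin boundary in the
# one-dimensional z space is their midpoint.
Z_INNER, Z_OUTER = 1, 2
THRESHOLD = (Z_INNER + Z_OUTER) / 2
MARGIN = (Z_OUTER - Z_INNER) / 2  # distance from the boundary to each support vector


def side(x1, x2):
    # In the original space the boundary is the circle (x1-1)^2 + x2^2 = 1.5.
    return "inside" if z(x1, x2) < THRESHOLD else "outside"


# Hypothetical points, not taken from the question's dataset.
inner_point = (1, 1)  # z = 1
outer_point = (2, 1)  # z = 2
```

In the original space the classifier is a circle of radius sqrt(1.5) centered at (1, 0), which is why a linear threshold on z is enough.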
