DISC 420 – Section 2

In-Class Assignment 6
Wednesday 6th Dec 2023
Duration: 30 mins
Total Marks: 10

Name: ________________________ Roll Number: ______________________

Instructions:
• You do not need to use the systems for this assignment
• This is an individual assignment
• You will submit a Word document on LMS.
• Take a few minutes to read the questions carefully.
• No Internet
• No personal Laptops/phones/devices
• This is an open-book assignment; you may use any resources available on LMS.
• No Resubmissions. No late submissions.

Part A: [6 marks (2 + 2 + 2)] ~ 10 mins


For each case identify the following:
i. Is it a classification or a regression problem? Justify your answer. [1]
ii. What is the sample size? How many predictor variables? [1]

a. Case 1: The data is related to a marketing campaign for a banking institution. The goal is
to predict whether a client subscribes to a term deposit. The variables in the dataset are:
age, type of job, housing, loan, education, marital status, subscribes (yes/no). The data
consists of 750 clients.

b. Case 2: We are interested in determining whether a new patient will be assigned to group A
(treatment A) or group B (treatment B). The dataset has the following variables: stage of the
disease, patient age, blood pressure, Body Mass Index (BMI), smoking cigarettes
(yes/no), group (A or B). The data consists of 650 patients in a hospital.

c. Case 3: We aim to measure the effects of various social conditions on 1500 individuals'
mental health outcome measure. The outcome measure is discrete and ranges from 0 to 5,
where 5 represents good mental health and 0 represents poor mental health. For the
data collection survey, respondents provided information on aspects of their daily lives,
including economic obligations (child care, medical care, food, clothing, and bills) and
health and well-being (amount of exercise, height, weight, and whether they smoked).
In addition, respondents answered questions related to household and family, such
as how many people lived in the household and what kind of child care they used.
Demographic information on respondents includes marital status, education, birth year,
race, religion, and income.

Part B: [4 marks] ~ 15 mins

The KNN algorithm is implemented on the above data with two classes, 0 and 1.
i. What are the key differences that you notice about the classification boundaries made
by the three models below (make one comment per model)? [3]
ii. Which model would you choose from the three below? [1]

K = 1:

K = 15:

K = 101:
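
For reference only (no code is required for this assignment), the way a KNN classifier votes for different values of K can be sketched as below. This is a minimal illustration under stated assumptions: the toy points, the query point, and the function name knn_predict are hypothetical and are not taken from the assignment data or from the models shown above.

```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Return the majority class among the k training points nearest to query."""
    # Sort training points by Euclidean distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the class labels of those neighbours and return the most frequent one.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: ((x1, x2), class label).
train = [
    ((1.0, 1.0), 0), ((1.5, 2.0), 0), ((2.0, 1.0), 0), ((2.0, 2.0), 0),
    ((1.2, 1.4), 1),  # a single class-1 point inside the class-0 cluster
    ((6.0, 6.0), 1), ((6.5, 7.0), 1), ((7.0, 6.0), 1), ((7.0, 7.0), 1),
]

# Hypothetical query point, placed next to the isolated class-1 point.
query = (1.25, 1.45)

for k in (1, 5, 9):
    print(f"K = {k}: predicted class {knn_predict(train, query, k)}")
```

On this toy data the prediction for the same query point changes with K: with K = 1 it copies the single nearest point, with K = 5 the surrounding class-0 points outvote it, and with K = 9 (every training point) the vote is simply the overall majority class.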

End of Assignment
