Introduction To Machine Learning Week 1 Assignment 1 Graded
Week 1: Assignment 1
1). Which of the following is a supervised learning problem?
Grouping related documents from an unannotated corpus.
Predicting credit approval based on historical data.
Predicting if a new image contains a cat or a dog, based on historical data of other images of cats and dogs, where you are supplied the information about which image is a cat and which is a dog.
Fingerprint recognition of a particular person used in biometric attendance, from the fingerprint data of various other people and that particular person.
Answer: The supervised learning problem among the options is predicting credit approval based on historical data.
In supervised learning, we have a labeled dataset where the input features and
their corresponding outputs are known. In this case, the historical data would
include information about credit applications, such as the applicant's
attributes (e.g., income, age, credit score) and whether their application was
approved or denied. The goal is to build a model that can predict credit
approval for new, unseen applications based on the patterns learned from the
labeled historical data.
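To make the idea concrete, here is a minimal sketch of that setup: labeled historical applications, and a prediction for a new applicant. The numbers, features, and the 1-nearest-neighbour rule are all made up purely for illustration, not a real credit model.

```python
# Toy supervised learning: each training example pairs applicant features
# (income in $1000s, credit score) with a known label (1 = approved, 0 = denied).
# A 1-nearest-neighbour rule then predicts the label for a new applicant.
# All numbers here are hypothetical.

train_X = [(65, 720), (30, 580), (90, 760), (25, 540), (50, 690)]
train_y = [1, 0, 1, 0, 1]  # historical approval decisions (the labels)

def predict(x):
    """Return the label of the closest training example (1-NN)."""
    dists = [((x[0] - a) ** 2 + (x[1] - b) ** 2, y)
             for (a, b), y in zip(train_X, train_y)]
    return min(dists)[1]

new_applicant = (70, 700)
print(predict(new_applicant))  # closest to (65, 720), so predicts 1 (approved)
```

The key point is that the labels in `train_y` come from past outcomes; an unsupervised method such as clustering would receive only `train_X`.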
• Predicting if a user would like to listen to a newly released song or not based on
historical data is a classification problem.
• Predicting the confirmation probability (in fraction) of your train ticket whose
current status is waiting list based on historical data is a regression problem.
• Predicting if a patient has diabetes or not based on historical medical records is
a classification problem.
• Predicting if a customer is satisfied or unsatisfied with a product purchased from an e-commerce website, using the reviews he/she wrote for the purchased product, is a classification problem.
4). Which of the following is an unsupervised learning task?
Group audio files based on language of the speakers.
Group applicants to a university based on their nationality.
Predict a student’s performance in the final exams.
Predict the trajectory of a meteorite.
• Gender of a person
• Ethnicity of a person
• The color of the curtains in your room
• Number of legs of an animal
Let X be a random variable uniformly distributed over [0, 4] and Y an independent random variable uniformly distributed over [0, 6]. What is the probability that max(X, Y) > 3?
1/6
5/6
2/3
1/2
2/6
5/8
None of the above
Answer: The probability that max(X,Y)>3 can be calculated as follows:
P(max(X,Y)>3) = 1 - P(max(X,Y)<=3)
Since X and Y are independent and uniformly distributed over the intervals [0,4] and [0,6]
respectively, we can use the following formula:
P(max(X,Y)<=k) = P(X<=k) * P(Y<=k)
where k is a constant.
Therefore,
P(max(X,Y)<=3) = P(X<=3) * P(Y<=3)
P(X<=3) = 3/4 (since X is uniformly distributed over [0,4])
P(Y<=3) = 1/2 (since Y is uniformly distributed over [0,6])
So,
P(max(X,Y)<=3) = (3/4) * (1/2) = 3/8
Therefore,
P(max(X,Y)>3) = 1 - P(max(X,Y)<=3)
= 1 - 3/8
= 5/8
Therefore, the probability that max(X,Y)>3 is 5/8.
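The derivation above can be sanity-checked with a quick Monte Carlo simulation; this sketch simply samples the two uniform variables and counts how often the maximum exceeds 3.

```python
import random

# Monte Carlo check of the derivation above: X ~ Uniform[0, 4], Y ~ Uniform[0, 6],
# independent; estimate P(max(X, Y) > 3). The closed form gives 5/8 = 0.625.
random.seed(0)
N = 200_000
hits = sum(max(random.uniform(0, 4), random.uniform(0, 6)) > 3 for _ in range(N))
print(hits / N)  # ≈ 0.625
```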
8). Find the mean of 0-1 loss for the given predictions:
Answer: The mean of 0-1 loss for the given predictions can be calculated as follows:
Let y_i be the true label and y_i_hat be the predicted label.
0-1 loss is defined as:
L(y_i, y_i_hat) = 1 if y_i != y_i_hat and 0 otherwise.
Therefore, the 0-1 loss for the given predictions is:
L(1, 1) = 0
L(0, 0) = 0
L(1, 0) = 1
L(0, 1) = 1
The mean of 0-1 loss is defined as:
mean(L(y_i, y_i_hat)) = (1/n) * sum(L(y_i, y_i_hat))
where n is the number of predictions.
Therefore,
mean(L(y_i, y_i_hat)) = (1/4) * (L(1, 1) + L(0, 0) + L(1, 0) + L(0, 1))
= (1/4) * (0 + 0 + 1 + 1)
= (1/4) * 2
= 0.5
Therefore, the mean of 0-1 loss for the given predictions is 0.5.
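The same calculation in code, using the four (true, predicted) pairs that produce the losses listed above (the original prediction table is not shown, so these pairs are reconstructed from the answer):

```python
# Mean 0-1 loss over the four (true label, predicted label) pairs from the answer:
# (1,1), (0,0), (1,0), (0,1).
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 0, 1]

def zero_one_loss(y, y_hat):
    """1 if the prediction is wrong, 0 if it is correct."""
    return 1 if y != y_hat else 0

mean_loss = sum(zero_one_loss(y, p) for y, p in zip(y_true, y_pred)) / len(y_true)
print(mean_loss)  # 0.5
```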
9). Which of the following statements are true? Check all that apply.
A model with more parameters is more prone to overfitting and typically
has higher variance.
If a learning algorithm is suffering from high bias, only adding more
training examples may not improve the test error significantly.
When debugging learning algorithms, it is useful to plot a learning curve
to understand if there is a high bias or high variance problem.
If a neural network has much lower training error than test error, then
adding more layers will help bring the test error down because we can fit
the test set better.
• If a neural network has much lower training error than test error, adding more layers will not help bring the test error down: the gap indicates overfitting (high variance), and adding layers increases model capacity, which tends to make overfitting worse. This statement is false.
• If a learning algorithm is suffering from high bias, only adding more training examples may not improve the test error significantly: an underfitting model is limited by its capacity rather than by the amount of data, so increasing model complexity or adding features is usually needed as well. This statement is true.
E[f^(x)] − f(x), E[(E[f^(x)] − f^(x))^2]
E[f^(x)] − f(x), E[(E[f^(x)] − f^(x))]^2
(E[f^(x)] − f(x))^2, E[(E[f^(x)] − f^(x))^2]
(E[f^(x)] − f(x))^2, E[(E[f^(x)] − f^(x))]^2
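The squared bias (E[f^(x)] − f(x))^2 and the variance E[(E[f^(x)] − f^(x))^2] appearing in the options above can be estimated empirically by repeating the fitting procedure on many training sets. A minimal Monte Carlo sketch, using a deliberately biased made-up estimator (a shrunken sample mean) so that both quantities are nonzero:

```python
import random

# At a fixed input x, draw many training sets, compute the estimator f_hat on
# each, then estimate
#   squared bias: (E[f_hat(x)] - f(x))^2
#   variance:     E[(E[f_hat(x)] - f_hat(x))^2]
# The estimator and all numbers here are made up for illustration.
random.seed(1)
f_x = 2.0           # true value f(x)
sigma, n = 1.0, 10  # noise level and training-set size

def f_hat():
    sample = [f_x + random.gauss(0, sigma) for _ in range(n)]
    return 0.8 * sum(sample) / n  # deliberately biased: E[f_hat] = 0.8 * f(x)

estimates = [f_hat() for _ in range(100_000)]
mean_est = sum(estimates) / len(estimates)           # ≈ E[f_hat(x)] = 1.6
bias_sq = (mean_est - f_x) ** 2                      # ≈ (-0.4)^2 = 0.16
variance = sum((mean_est - e) ** 2 for e in estimates) / len(estimates)
print(bias_sq, variance)                             # variance ≈ 0.8^2 * sigma^2 / n = 0.064
```

The empirical values match the closed forms for this estimator: bias = 0.8·f(x) − f(x) = −0.4 and variance = 0.64·σ²/n = 0.064.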