
Muhammet Fatih DİNÇ

2023776042

q1:

q2: Give three examples of AI which are not based on machine
learning. You cannot give the example discussed in Lecture 1

1. ASIMO (2000): mostly uses rule-based systems to mimic human
movements; it stands as an example of AI without machine learning
2. MANIAC 1 (1956): the first computer to beat a human at a (modified
version of) chess, using brute-force search with a Shannon Type A
strategy
3. Automated theorem provers: these use formal logic to prove
mathematical theorems automatically, without any machine learning
techniques

q3: We know that in machine learning an algorithm for model
estimation has model parameters and hyperparameters. Is the model
order of a parametric model (such as the order of a polynomial
regression model) a model parameter or a hyperparameter? Justify
your answer

It is a hyperparameter: the model order (a hyperparameter) is
selected before training starts, and during training the coefficients of
the polynomial terms (the model parameters) are determined.
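
A minimal sketch of this split, assuming NumPy's polyfit and synthetic
data invented for illustration:

    import numpy as np

    # Toy data: a noisy quadratic (made up for this example)
    rng = np.random.default_rng(0)
    x = np.linspace(-3, 3, 50)
    y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.3, size=x.shape)

    degree = 2  # hyperparameter: fixed BEFORE training starts

    # Training: the coefficients (model parameters) are estimated from data
    coeffs = np.polyfit(x, y, deg=degree)
    print(coeffs)  # learned by least squares, highest power first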

q4, a: Why are TPs decreasing as threshold increases from 0.5 to
0.95?
b-) Why are FNs increasing as threshold increases from 0.5 to 0.95?
c-) Why are FPs decreasing as threshold increases from 0.5 to 0.95?
d-) Why are TNs increasing as threshold increases from 0.5 to 0.95?

a) As we set the threshold higher, we make the model more
conservative, i.e. more selective when it comes to labeling data as
positive, which is why TPs decrease.

b) Since the model is more selective about labeling a mail as spam,
some spam mails that show the main characteristics of spam but not
'most of the characteristics' will be labeled as negative, so FNs
increase.

c) False positives decrease as the threshold increases because the
model is more conservative: it labels instances as 'positive' in a more
'selective' way, so FP falls.

We can explain it with an example: in some spam mails we are likely
to see expressions like 'Be a billionaire in no time'.

If we set the threshold so high that including that expression becomes
a must, most of our positive predictions are likely to be true; we are
less likely to label a non-spam mail as 'positive' since it doesn't
contain that expression, so FP decreases (though many of the other
spam mails that don't contain these expressions will fail to be labeled
as spam).

d) As the threshold increases, the model becomes more selective and
is less likely to predict the positive class, which increases predictions
of the negative class and results in an increase in TNs.
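
All four trends can be seen in a minimal sketch, assuming synthetic
scores and labels invented for illustration:

    import numpy as np

    # Made-up spam scores: spam (label 1) tends to score higher than ham (0)
    rng = np.random.default_rng(1)
    labels = rng.integers(0, 2, size=1000)
    scores = np.clip(labels * 0.35 + rng.random(1000) * 0.65, 0, 1)

    def confusion(threshold):
        pred = scores >= threshold
        tp = int(np.sum(pred & (labels == 1)))
        fp = int(np.sum(pred & (labels == 0)))
        fn = int(np.sum(~pred & (labels == 1)))
        tn = int(np.sum(~pred & (labels == 0)))
        return tp, fp, fn, tn

    for t in (0.5, 0.95):
        print(t, confusion(t))  # TPs and FPs shrink, FNs and TNs grow with t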

q5: a) Cevdet trained a model to predict the price of houses using
housing data of Ankara. The resulting model works well according to
his test data. However, the developed ML model performs poorly
when used to predict the prices of houses in the Istanbul housing
market. Explain why that is happening and suggest a way of
obtaining a better model for the Istanbul data.

Since the two data distributions and the underlying factors that affect
house prices differ between the two cities, the model created for
Ankara fails for Istanbul (we need another model).

As an example, having a sea view is an important factor in price
determination in the Istanbul case, while it plays no role for Ankara
(since Ankara has no sea).

So the following should be done:

1. Feature engineering should be redone on the Istanbul data, since
the relevant factors differ
2. The machine learning model should be revised, since different
kinds of patterns may fail to be visible with the model selected for
Ankara
3. The hyperparameters of the chosen model should be optimized for
the Istanbul dataset (a sketch of this step follows the list)
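
A hedged sketch of step 3 with scikit-learn; the file name
istanbul_housing.csv, the price column, and the model choice are all
assumptions, not part of the question:

    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GridSearchCV

    # Assumed Istanbul dataset with re-engineered numeric features
    df = pd.read_csv("istanbul_housing.csv")  # placeholder file name
    X = df.drop(columns=["price"])
    y = df["price"]

    # Tune hyperparameters on Istanbul data instead of reusing Ankara's
    grid = GridSearchCV(
        RandomForestRegressor(random_state=0),
        param_grid={"n_estimators": [100, 300], "max_depth": [None, 10, 20]},
        cv=5,
        scoring="neg_mean_absolute_error",
    )
    grid.fit(X, y)
    print(grid.best_params_)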

q5-b) Cevdet is an ML lover and a very ambitious student. Next, he
developed a classification model for a multi-class classification
problem. The developed model is performing almost perfectly on
training, validation and test datasets. However, his model fails
significantly during model deployment. What may be the possible
causes?

One obvious possible scenario is that the training, validation and
test datasets were very similar to each other (or irrelevant to the
deployment data), and an 'overfitting' problem took place while the
parameters were being fitted to these datasets.

A possible solution is to find more/different/better datasets and to
simplify the model so that it is freed from overfitting.

q6-a: Is there a trade-off between Precision and Recall? Justify
your answer

Yes, there is. When you make the threshold for classifying an instance
as positive more stringent (e.g., requiring a higher confidence score),
Precision tends to increase but Recall decreases, and vice versa.

As an example, let's say we have a program to detect DSAI students
on campus on a specific day from the day's campus camera
recordings. Let the things that contribute to the score be 'entering the
department building while there is a DSAI lecture in it', 'having a
conversation with a DSAI instructor', 'being seen with a DSAI-related
book', and so on.

If we set the threshold very high, in other words if we only label
students as DSAI when they are seen with a DSAI instructor, were in
the Eng. Building during lecture hours, and were carrying a book
related to ML or Statistics when leaving the building, we are very
likely to get most of the labeling right, but we'll miss the other DSAI
students who don't carry books, who are late to class or who left
early. This is high precision but low recall.

Vice versa, if we keep the threshold low, say any student who entered
the building at some time reaches the threshold, then we get most of
the DSAI students in our positive list, but students from other
departments who just came to study at the department, to buy coffee,
or to attend a lecture of another department will also be in the
detected list. This is low precision but high recall.

So we have to choose where to put the threshold: 'high' means high
precision and low recall, 'low' means low precision and high recall;
they move in opposite directions.
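
The trade-off shows up directly when we sweep the threshold, as in
this minimal sketch with synthetic detector scores invented for
illustration:

    import numpy as np

    # Made-up scores: true DSAI students (positives) score higher on average
    rng = np.random.default_rng(2)
    labels = np.r_[np.ones(200), np.zeros(800)].astype(bool)
    scores = np.r_[rng.normal(0.7, 0.15, 200), rng.normal(0.4, 0.15, 800)]

    for t in (0.3, 0.5, 0.7, 0.9):
        pred = scores >= t
        tp = np.sum(pred & labels)
        precision = tp / max(pred.sum(), 1)  # guard against zero predictions
        recall = tp / labels.sum()
        print(f"t={t:.1f}  precision={precision:.2f}  recall={recall:.2f}")
    # As t rises, precision climbs while recall falls: the trade-off.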

q6-b: Is it possible that Precision and Recall increase together?
Justify your answer

Yes. If we improve the features or use a better classification
algorithm, then we can improve both of them, say by adding another
significant variable to our regression model.

q7-a:
Test_1

Precision_1 = TP/(TP+FP) = 40/(40+10) = 0.8
Recall_1 = TP/(TP+FN) = 40/(40+60) = 0.4
F1_1 = 2(Precision_1 × Recall_1)/(Precision_1 + Recall_1)
     = 2(0.8 × 0.4)/(0.8 + 0.4) = 0.5333
TPR_1 = Recall_1 = 0.4
FPR_1 = 0.1

Test_2

Precision_2 = TP/(TP+FP) = 40/(40+55) = 0.4211
Recall_2 = TP/(TP+FN) = 40/(40+60) = 0.4
F1_2 = 2(Precision_2 × Recall_2)/(Precision_2 + Recall_2)
     = 2(0.4211 × 0.4)/(0.4211 + 0.4) = 0.4103
TPR_2 = Recall_2 = 0.4
FPR_2 = 55/(55+945) = 0.055
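
A quick check of the arithmetic above (the confusion-matrix counts
are the ones used in these formulas):

    def metrics(tp, fp, fn):
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)  # also the TPR
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    print(metrics(tp=40, fp=10, fn=60))  # Test_1: (0.8, 0.4, 0.5333...)
    print(metrics(tp=40, fp=55, fn=60))  # Test_2: (0.4211..., 0.4, 0.4103...)
    print(55 / (55 + 945))               # FPR_2 = 0.055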

b)

Since F1_1 > F1_2, T1 is better regarding the F1 metric.

c)

Since (TPR/FPR)_1 = 0.4/0.1 = 4 and (TPR/FPR)_2 = 0.4/0.055 ≈ 7.27,
and 7.27 is higher, T2 is better.

d)

Recall and TPR are important since our challenge is to detect 'rare
(cancer) cases'. Also, since false detections come with a cost, a
balance between Recall and Precision is necessary, so TPR/FPR is
also a useful metric.
Still, the best choice comes out as F1, since it focuses both on the
'balance between recall and precision' and on 'having true positive
values'.
