
Lab 8: Least Squares Multiclass Classification

We will again look at the admission data (Admission_Predic_classification.csv in the lab folder). Here the
last column is labeled “Admission_status”. The admission status of a student can take one of three possible
values: 1 (admission offered), 2 (placed on the waiting list), or 3 (application rejected). Your task is to create a
least squares classification model that uses linear regression to classify each data point.

You should use the features GRE_Score (x1), TOEFL_Score (x2), University Rating (x3), SOP score (x4), LOR
score (x5), CGPA (x6), and Research experience (x7) to predict the correct class. (Do not use
chance of admission for this problem.)

The steps are as follows:

1. Randomly shuffle the data and take the first 400 data points to learn the parameters (thetas) of
your classifier as specified in steps 2 and 3. The remaining 100 data points will be your test
dataset.
2. For k = 1, 2, and 3, create 3 different linear functions $\tilde{f}_k(x) = y = \theta_0 + \theta_1 x_1 + \cdots + \theta_7 x_7$ to
distinguish class k from not k.
a. To do so, for each data point $x^{(i)}$ that belongs to class k, set the target variable $y^{(i)}$ to
+1, and for data points that do not belong to class k, set it to -1.
b. Then find the least squares estimates of the thetas.
3. The final multiclass classifier is then defined as $\hat{f}(x) = \operatorname{argmax}_k \tilde{f}_k(x)$; that is, for data point x
the classifier returns the value of k for which $\tilde{f}_k(x)$ is maximum. (For example, if $\tilde{f}_1(x) = -3$,
$\tilde{f}_2(x) = +2$, and $\tilde{f}_3(x) = 1.2$, then the data point is classified as belonging to class 2.)
4. Apply your final classifier on the test dataset. Output the confusion matrix and report the overall
classification accuracy.
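
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference solution: it uses synthetic data (500 points, 7 features, labels 1 to 3) in place of Admission_Predic_classification.csv, and the helper names fit_one_vs_rest, predict_class, and confusion_matrix are hypothetical. It assumes NumPy is available and uses np.linalg.lstsq to obtain the least squares estimates.

```python
import numpy as np

def fit_one_vs_rest(X, labels, classes):
    """Fit one +1/-1 least squares regressor per class.

    Returns an array of shape (n_features + 1, n_classes); column j holds
    theta_0..theta_n for the "class classes[j] vs. not" function.
    """
    A = np.column_stack([np.ones(len(X)), X])  # intercept column for theta_0
    # Column j of Y is +1 where the label equals classes[j], otherwise -1
    Y = np.where(labels[:, None] == np.asarray(classes)[None, :], 1.0, -1.0)
    theta, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return theta

def predict_class(X, theta, classes):
    """Return, for each row of X, the class whose linear function is largest."""
    A = np.column_stack([np.ones(len(X)), X])
    scores = A @ theta  # shape (n_points, n_classes)
    return np.asarray(classes)[np.argmax(scores, axis=1)]

def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[int(t)], index[int(p)]] += 1
    return cm

# Synthetic stand-in for the admission data (assumption: not the real CSV).
rng = np.random.default_rng(0)
n_points, n_features = 500, 7
classes = [1, 2, 3]
labels = rng.integers(1, 4, size=n_points)             # labels in {1, 2, 3}
means = rng.normal(scale=2.0, size=(3, n_features))    # one mean per class
X = means[labels - 1] + rng.normal(size=(n_points, n_features))

# Step 1: shuffle, then use 400 points for training and 100 for testing.
perm = rng.permutation(n_points)
train, test = perm[:400], perm[400:]

# Steps 2 and 3: fit the three linear functions and classify by argmax.
theta = fit_one_vs_rest(X[train], labels[train], classes)
y_pred = predict_class(X[test], theta, classes)

# Step 4: confusion matrix and overall accuracy on the test set.
cm = confusion_matrix(labels[test], y_pred, classes)
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(f"accuracy = {accuracy:.3f}")
```

To use this on the lab data, X and labels would be replaced by the seven feature columns and the Admission_status column read from the CSV; the exact column names in that file are not assumed here.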
