Bilal Ahmad AI & DSS Assignment # 03
SOFTWARE ENGINEERING
COLLEGE OF E&ME, NUST, RAWALPINDI
DEGREE: 42
SYNDICATE: A
DEPARTMENT: Computer Engineering
2. To start, pick 100 of each digit randomly. Your matrix should end up 2D, with 200
rows and 256 columns. Remember to store the true label of each digit in another
array, called labels (or whatever you want).
3. Use the KNN rule to classify each of the digits in your training set, and
report the training accuracy. Plot a graph to display the training accuracy
as you vary K from 1 to 20.
4. Now break your training set randomly into 2 equal parts, one part you
will use for training, and one part for testing. Plot a testing accuracy
graph, again varying K.
5. Perform the above task again by repeating the random split of the data into
training and testing sets, then reclassify. Do you get the same behaviour?
Plot the average and standard deviation, as error bars, of both sets of testing
accuracies for all values of K. Remember that all graphs should have axis
labels and a title. If you do not know which MATLAB commands to use, try
searching online.
6. When you are done with all this, extend the above to load and predict
digits ‘3’, ‘6’, and ‘8’. You can visualize your classifications using the
showdata function provided.
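The random-sampling step in task 2 can be sketched as follows. This is a minimal sketch: `digit_a` and `digit_b` are hypothetical stand-ins for the two digit classes, since the actual variable names inside the provided .mat file are not shown here.

```python
import numpy as np

rng = np.random.default_rng()

# Hypothetical stand-ins for the two loaded digit classes; in the assignment
# these rows would come from the provided .mat file via scipy.io.loadmat.
digit_a = rng.random((500, 256))
digit_b = rng.random((500, 256))

def pick_random(images, n=100):
    # n rows chosen at random without replacement
    idx = rng.choice(len(images), size=n, replace=False)
    return images[idx]

samples = np.vstack([pick_random(digit_a), pick_random(digit_b)])
labels = np.concatenate([np.zeros(100), np.ones(100)])  # true class of each row

print(samples.shape)  # (200, 256): 200 rows by 256 columns, as required
print(labels.shape)   # (200,)
```

Stacking the two 100-row samples gives exactly the 200 x 256 matrix the task asks for, with `labels` holding the true class of each row.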
Code:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from scipy.io import loadmat
def get_response(arr):
    # Majority vote: the most frequent label among the k neighbours
    return np.bincount(arr).argmax()
# Running counters for the classification loop (the loop body is elided here)
n = 0
mistakes = 0

def show_digit(m, num_examples, mistakes):
    # Display one digit image with the running mistake count in the title
    plt.imshow(m, cmap='gray')
    plt.axis('off')
    plt.title(f"{mistakes} mistakes out of {num_examples} "
              f"({(mistakes / num_examples) * 100:.2f}%)", fontsize=16)
    plt.show()
    plt.pause(0.001)
    input("Press Enter to close the plot...")
# Number of samples (100 per digit; 300 once the task is extended to three digits)
num_samples = 300
# Placeholders; in the actual run these are filled from the loaded .mat data
samples = np.zeros((num_samples, 256))
labels = np.zeros(num_samples)

# Use the samples and labels arrays as the training set
x_train = samples.astype(int)
y_train = labels.astype(int)
def find_neighbors(k, instance, x_train, y_train):
    # Labels of the k training points closest to instance (Euclidean distance)
    distances = np.linalg.norm(x_train - instance, axis=1)
    return y_train[np.argsort(distances)[:k]]

train_accuracies = []
ks = list(range(1, 21))
for k in ks:
    predictions = [get_response(find_neighbors(k, instance, x_train, y_train))
                   for instance in x_train]
    accuracy = accuracy_score(y_train, predictions)
    train_accuracies.append(accuracy)
    print(f"Training Accuracy for k={k}: {accuracy:.4f}")
num_splits = 2
testing_accuracies_all_splits = []
for split in range(num_splits):
    # Random 50/50 split into training and testing halves (tasks 4 and 5)
    x_tr, x_test, y_tr, y_test = train_test_split(
        x_train, y_train, test_size=0.5, random_state=split)
    testing_accuracies = []
    for k in ks:
        predictions = [get_response(find_neighbors(k, instance, x_tr, y_tr))
                       for instance in x_test]
        accuracy = accuracy_score(y_test, predictions)
        testing_accuracies.append(accuracy)
        print(f"Testing Accuracy for k={k} (Split {split + 1}): {accuracy:.4f}")
    testing_accuracies_all_splits.append(testing_accuracies)
# Task 5
testing_accuracies_all_splits = np.array(testing_accuracies_all_splits)
average_accuracies = np.mean(testing_accuracies_all_splits, axis=0)
std_accuracies = np.std(testing_accuracies_all_splits, axis=0)
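The error-bar plot that task 5 asks for is not shown in the listing above; a minimal sketch using matplotlib's errorbar, with made-up accuracy values standing in for the real means and standard deviations, and the Agg backend so it runs headless:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this to view the plot
import matplotlib.pyplot as plt

ks = np.arange(1, 21)
# Stand-in values; in the assignment these are the mean and std of the
# testing accuracies over the repeated random splits.
average_accuracies = np.linspace(0.95, 0.85, len(ks))
std_accuracies = np.full(len(ks), 0.02)

plt.errorbar(ks, average_accuracies, yerr=std_accuracies, capsize=3, marker="o")
plt.xlabel("K")
plt.ylabel("Testing accuracy")
plt.title("Mean testing accuracy vs. K (error bars show one std over splits)")
plt.xticks(ks)
plt.savefig("accuracy_vs_k.png")
```

Note the axis labels and title, which the task sheet explicitly requires on every graph.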
Output:
(plots of training and testing accuracy versus K, including the error-bar plot for the repeated splits)