KNN Classification Model
Iris Dataset
Using the train_test_split Function
Step 1:
Open Anaconda Navigator and launch Jupyter Notebook.
Save the iris.csv dataset in the same folder as your notebook.
Create a new Python notebook and start coding.
Step 2:
Import the pandas library.
pandas is used to read the CSV file saved in Step 1.
Load the dataset into a DataFrame.
import pandas as pd
iris_data = pd.read_csv('C:/Users/ariny/OneDrive/Documents/iris.csv')
iris_data
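A quick look at the shape and column names after loading helps confirm that the file was read as expected. The sketch below is only an illustration: it reads a tiny stand-in CSV from memory instead of the real iris.csv, and the column names are assumptions.

```python
import io

import pandas as pd

# Stand-in for iris.csv; the column names here are assumptions.
csv_text = (
    "sepal_length,petal_length,species\n"
    "5.1,1.4,setosa\n"
    "7.0,4.7,versicolor\n"
    "6.3,6.0,virginica\n"
)

# read_csv accepts a file path or any file-like object.
iris_data = pd.read_csv(io.StringIO(csv_text))

print(iris_data.shape)          # (rows, columns)
print(list(iris_data.columns))  # column names taken from the header row
print(iris_data.head())         # first rows of the table
```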
Step 3:
Split the dataset into training and testing sets using the train_test_split function.
This function assigns rows to the training and testing sets at random.
Import train_test_split from sklearn.model_selection.
Drop the species and random columns to form the input features; the species column is the target.
Set test_size to 0.3 so that 70% of the rows are used for training and 30% for testing.
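The split described in this step can be sketched as follows. This is a minimal example rather than the original notebook code: the small DataFrame stands in for iris.csv, the feature column names are assumptions, and random_state is added only to make the run repeatable.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Small stand-in for iris.csv; feature column names are assumptions.
iris_data = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8, 5.0, 6.9, 6.5, 5.5],
    "petal_length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1, 1.5, 4.9, 5.8, 4.0],
    "random": [3, 8, 1, 10, 5, 2, 9, 4, 7, 6],
    "species": ["setosa", "setosa", "versicolor", "versicolor", "virginica",
                "virginica", "setosa", "versicolor", "virginica", "versicolor"],
})

# Inputs: everything except the target and the helper column.
X = iris_data.drop(columns=["species", "random"])
# Target: the species label.
y = iris_data["species"]

# 70% of the rows go to training, 30% to testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(X_train.shape, X_test.shape)
```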
Step 4:
Import KNeighborsClassifier from sklearn.neighbors.
Use the classifier variant because the target is a class label.
Set n_neighbors to the number of nearest neighbors to consult for each prediction.
Train the model using knn.fit and predict using knn.predict.
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
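The two lines above assume that knn already exists. A minimal, self-contained sketch of the whole step is shown here; the tiny hand-written training set and the choice of n_neighbors=3 are assumptions for illustration only.

```python
from sklearn.neighbors import KNeighborsClassifier

# Toy measurements (petal length, petal width) standing in for the real split.
X_train = [[1.4, 0.2], [1.5, 0.2], [1.3, 0.3],
           [4.7, 1.4], [4.5, 1.5], [4.9, 1.5]]
y_train = ["setosa", "setosa", "setosa",
           "versicolor", "versicolor", "versicolor"]
X_test = [[1.6, 0.2], [4.6, 1.3]]

# n_neighbors=3 is an assumed value; the text leaves the choice open.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)        # store the training points and labels
y_pred = knn.predict(X_test)     # majority vote among the 3 nearest points

print(y_pred)
```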
Step 5:
Import metrics from sklearn to calculate accuracy.
Calculate the accuracy score for the KNN model.
Output:
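Accuracy is the fraction of test rows whose predicted label matches the true label. The sketch below uses made-up label lists (not results from the iris run) to show that metrics.accuracy_score agrees with a direct count.

```python
from sklearn import metrics

# Made-up labels for illustration; these are not results from the dataset.
y_test = ["setosa", "setosa", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "versicolor", "virginica"]

accuracy = metrics.accuracy_score(y_test, y_pred)

# Same quantity computed by hand: correct predictions / total predictions.
correct = sum(true == pred for true, pred in zip(y_test, y_pred))
manual_accuracy = correct / len(y_test)

print("Accuracy:", accuracy)
print("Manual accuracy:", manual_accuracy)
```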
Step 6:
Import classification_report and confusion_matrix from sklearn.metrics.
Print classification_report to generate a report containing precision, recall, f1-score and support,
and print confusion_matrix to see how many predictions were correct and incorrect for each class.
Output:
print(classification_report(y_test, y_pred))
Output:
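For a three-class problem such as iris, the confusion matrix has one row per true class and one column per predicted class. A minimal sketch with made-up labels (assumed for illustration, not taken from the real run):

```python
from sklearn.metrics import classification_report, confusion_matrix

# Made-up labels; one versicolor sample is misclassified as virginica.
y_test = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

# Fix the row/column order explicitly so the matrix is easy to read.
labels = ["setosa", "versicolor", "virginica"]
cm = confusion_matrix(y_test, y_pred, labels=labels)

print(cm)                                     # rows: true class, columns: predicted class
print(classification_report(y_test, y_pred))  # precision, recall, f1-score, support
```

Values on the diagonal count correct predictions; anything off the diagonal is a misclassification.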
Using a Manually Randomized Order
Step 1:
Open Anaconda Navigator and launch Jupyter Notebook.
In the dataset's CSV file, add a column containing =RANDBETWEEN(x,y), where x is 1 and y is the number of the last row.
Sort the rows by that random column, then split them into two files: one for training data and one for testing data.
Save both datasets in the same folder as your notebook.
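The same shuffle-then-split idea can be sketched in pandas instead of a spreadsheet. This is an alternative illustration, not the procedure the steps describe: the stand-in table, the 70/30 cut, and the seed are all assumptions.

```python
import pandas as pd

# Stand-in table; in practice this would be the dataset loaded from CSV.
data = pd.DataFrame({
    "feature": list(range(10)),
    "class": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

# Shuffle every row (frac=1) in a repeatable random order.
shuffled = data.sample(frac=1, random_state=0).reset_index(drop=True)

# Assumed 70/30 cut; the text only says to split into two parts.
split_at = int(len(shuffled) * 0.7)
train = shuffled.iloc[:split_at]
test = shuffled.iloc[split_at:]

print(len(train), len(test))
# train.to_csv("train.csv", index=False)  # hypothetical output file names
# test.to_csv("test.csv", index=False)
```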
Step 2:
Import the pandas library.
pandas is used to read the CSV files saved in Step 1.
Load the training and testing datasets into DataFrames.
import pandas as pd
iris_train = pd.read_csv('C:/Users/ariny/OneDrive/Documents/bill_train.csv')
iris_test = pd.read_csv('C:/Users/ariny/OneDrive/Documents/bill_test.csv')
Step 3:
Divide the training and testing datasets into input and target variables.
Drop the class label column to form the inputs, and use the class label column on its own as the target.
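One way to produce the four variables used in the next steps is sketched below. The stand-in DataFrames and the label column name "species" are assumptions; adjust the name to match the actual CSV header.

```python
import pandas as pd

label_column = "species"  # assumed name of the class label column

# Stand-ins for the training and testing files loaded in Step 2.
iris_train = pd.DataFrame({
    "petal_length": [1.4, 4.7, 6.0, 1.5],
    "petal_width": [0.2, 1.4, 2.5, 0.2],
    "species": ["setosa", "versicolor", "virginica", "setosa"],
})
iris_test = pd.DataFrame({
    "petal_length": [1.3, 5.1],
    "petal_width": [0.3, 1.9],
    "species": ["setosa", "virginica"],
})

# Inputs are every column except the label; the target is the label alone.
input_training = iris_train.drop(columns=[label_column])
target_training = iris_train[label_column]
input_testing = iris_test.drop(columns=[label_column])
target_testing = iris_test[label_column]

print(input_training.columns.tolist())
print(target_testing.tolist())
```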
Step 4:
Import KNeighborsClassifier from sklearn.neighbors.
Use the classifier variant because the target is a class label.
Set n_neighbors to the number of nearest neighbors to consult for each prediction.
Train the model using knn.fit and predict using knn.predict.
knn.fit(input_training, target_training)
y_pred = knn.predict(input_testing)
Step 5:
Import metrics from sklearn.
Calculate the accuracy using metrics.accuracy_score.
from sklearn import metrics
print("Accuracy:", metrics.accuracy_score(target_testing, y_pred))
Output:
Step 6:
Import confusion_matrix and classification_report from sklearn.metrics.
Print the confusion matrix to see how many predictions were correct and incorrect for each class.
Print the classification report to generate a report containing the precision, recall, f1-score and support of the model.
Output:
print(classification_report(target_testing, y_pred))
Output:
Bill Dataset
Using the train_test_split Function
Step 1:
Open Anaconda Navigator and launch Jupyter Notebook.
Save the bill_authentication.csv dataset in the same folder as your notebook.
Create a new Python notebook and start coding.
Step 2:
Import the pandas library.
pandas is used to read the CSV file saved in Step 1.
Load the dataset into a DataFrame.
import pandas as pd
bill_data = pd.read_csv('C:/Users/ariny/OneDrive/Documents/bill_authentication.csv')
bill_data
Step 3:
Split the dataset into training and testing sets using the train_test_split function.
This function assigns rows to the training and testing sets at random.
Import train_test_split from sklearn.model_selection.
Drop the class column to form the input features; the class column is the target.
Set test_size to 0.3 so that 70% of the rows are used for training and 30% for testing.
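Because the split is random, two runs can give different training and testing rows, and therefore different accuracy figures. Passing a fixed random_state makes the split repeatable. A minimal sketch, assuming a small stand-in table whose feature column names are made up:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for bill_authentication.csv; feature names are assumptions.
bill_data = pd.DataFrame({
    "variance": [3.6, 4.5, 3.9, -1.4, -2.5, -0.9, 2.1, -3.0, 1.2, -1.8],
    "skewness": [8.7, 8.2, -2.6, 3.3, -6.1, 0.5, 7.4, -4.2, 5.9, -0.7],
    "class": [0, 0, 0, 1, 1, 1, 0, 1, 0, 1],
})

X = bill_data.drop(columns=["class"])
y = bill_data["class"]

# The same random_state yields the same rows in each part, run after run.
first = train_test_split(X, y, test_size=0.3, random_state=1)
second = train_test_split(X, y, test_size=0.3, random_state=1)

X_test_first, X_test_second = first[1], second[1]
same_rows = X_test_first.index.equals(X_test_second.index)
print("Same test rows in both runs:", same_rows)
```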
Step 4:
Import KNeighborsClassifier from sklearn.neighbors.
Use the classifier variant because the target is a class label (true or false).
Set n_neighbors to the number of nearest neighbors to consult for each prediction.
Train the model using knn.fit and predict using knn.predict.
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
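Since n_neighbors is a value to be chosen, it can help to compare a few candidates. The sketch below loops over several values and records the test accuracy for each. It uses synthetic data from make_classification as a stand-in for the bill dataset, and the candidate values are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class data standing in for the real dataset.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scores = {}
for k in (1, 3, 5, 7, 9):  # assumed candidate values
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    scores[k] = accuracy_score(y_test, knn.predict(X_test))

for k, score in scores.items():
    print(f"n_neighbors={k}: accuracy={score:.3f}")
```

Odd values avoid tied votes in a two-class problem. For a final choice, a separate validation set or cross-validation is a safer guide than the test set.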
Step 5:
Import metrics from sklearn to calculate accuracy.
Calculate the accuracy score for the KNN model.
Output:
Step 6:
Import classification_report and confusion_matrix from sklearn.metrics.
Print classification_report to generate a report containing precision, recall, f1-score and support,
and print confusion_matrix to see how many predictions were correct and incorrect for each class.
Output:
print(confusion_matrix(y_test, y_pred))
Output:
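With a two-class target, the confusion matrix has four cells that can be unpacked into true negatives, false positives, false negatives and true positives. A minimal sketch with made-up labels (not results from the bill dataset):

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration: 0 and 1 are the two classes.
y_test = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

cm = confusion_matrix(y_test, y_pred)

# For labels [0, 1] the layout is [[TN, FP], [FN, TP]].
tn, fp, fn, tp = cm.ravel()

print(cm)
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)
```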
Using a Manually Randomized Order
Step 1:
Open Anaconda Navigator and launch Jupyter Notebook.
In bill_authentication.csv, add a column containing =RANDBETWEEN(x,y), where x is 1 and y is the number of the last row.
Sort the rows by that random column, then split them into two files: one for training data and one for testing data.
Save both datasets in the same folder as your notebook.
Step 2:
Import the pandas library.
pandas is used to read the CSV files saved in Step 1.
Load the training and testing datasets into DataFrames.
import pandas as pd
bill_train = pd.read_csv('C:/Users/ariny/OneDrive/Documents/bill_train.csv')
bill_test = pd.read_csv('C:/Users/ariny/OneDrive/Documents/bill_test.csv')
Step 3:
Divide the training and testing datasets into input and target variables.
Drop the class column to form the inputs, and use the class column on its own as the target.
Step 4:
Import KNeighborsClassifier from sklearn.neighbors.
Use the classifier variant because the target is a class label (true or false).
Set n_neighbors to the number of nearest neighbors to consult for each prediction.
Train the model using knn.fit and predict using knn.predict.
knn.fit(input_training, target_training)
y_pred = knn.predict(input_testing)
Step 5:
Import metrics from sklearn.
Calculate the accuracy using metrics.accuracy_score.
from sklearn import metrics
print("Accuracy:", metrics.accuracy_score(target_testing, y_pred))
Output:
Step 6:
Import confusion_matrix and classification_report from sklearn.metrics.
Print the confusion matrix to see how many predictions were correct and incorrect for each class.
Print the classification report to generate a report containing the precision, recall, f1-score and support of the model.
Output:
print(classification_report(target_testing, y_pred))
Output:
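The precision, recall and f1-score columns of the report can be reproduced by hand from the prediction counts, which makes clear what each number means. A minimal sketch in plain Python; the helper name precision_recall_f1 and the label lists are hypothetical and used only for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    precision = tp / (tp + fp) if tp + fp else 0.0  # how many predicted positives are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # how many real positives were found
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0  # harmonic mean of the two
    return precision, recall, f1


# Made-up labels: 4 true positives, 1 false positive, 2 false negatives.
target_testing = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(target_testing, y_pred)
support = target_testing.count(1)  # number of true samples of the class

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} support={support}")
```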