Index
Name – JINESH PRAJAPAT   Class – B. Tech, III Year   Branch – AI & DS   Sem – V

Sr. No.   Topic   Page No.   Date of Done   Date of Checked   Sign/Remarks
Name – JINESH PRAJAPAT Class - B. Tech, III Year Branch - AI & DS Sem – V
Subject – Introduction to Data Science & Machine Learning (AI 354)
Experiment 9
Aim – Write a Python program to implement Logistic Regression for Data Science and Machine Learning.
Code –
In [1]: print("Jinesh Prajapat")
Jinesh Prajapat
In [2]: import numpy as np
import warnings
warnings.filterwarnings('ignore')
In [3]: # min-max normalization to the [0, 1] range
x = (x_data - np.min(x_data)) / (np.max(x_data) - np.min(x_data)).values
# transpose so that each column holds one sample
x_train = x_train.T
x_test = x_test.T
y_train = y_train.T
y_test = y_test.T
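The cell above assumes `x_data`, `y_data`, and a train/test split that are not shown. A minimal self-contained sketch of those steps, using a placeholder NumPy array in place of the notebook's actual data (all values here are assumptions for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
x_data = rng.uniform(0, 255, size=(100, 8))  # placeholder feature matrix
y_data = rng.integers(0, 2, size=100)        # placeholder binary labels

# min-max normalization to [0, 1], as in the cell above
x = (x_data - np.min(x_data)) / (np.max(x_data) - np.min(x_data))

x_train, x_test, y_train, y_test = train_test_split(
    x, y_data, test_size=0.2, random_state=42)

# transpose so that each column holds one sample, matching the cell above
x_train, x_test = x_train.T, x_test.T
```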
In [6]: # sigmoid maps z to a probability in (0, 1)
# z = np.dot(w.T, x_train) + b
# y_head = sigmoid(z)
def sigmoid(z):
    y_head = 1/(1 + np.exp(-z))
    return y_head
# predict(parameters["weight"], parameters["bias"], x_test)
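As a quick sanity check, the sigmoid maps 0 to 0.5 and saturates toward 0 and 1 at the extremes (the function is restated so the snippet runs on its own):

```python
import numpy as np

def sigmoid(z):
    # logistic function: 1 / (1 + e^(-z))
    return 1 / (1 + np.exp(-z))

print(sigmoid(0))                                    # → 0.5, the curve's midpoint
print(sigmoid(100) > 0.999, sigmoid(-100) < 0.001)   # saturates toward 1 and 0
```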
In [10]: # %%
def logistic_regression(x_train, y_train, x_test, y_test, learning_rate, num_iterations):
    # initialize
    dimension = x_train.shape[0]  # that is 4096
    w, b = initialize_weights_and_bias(dimension)
    # do not change learning rate
    parameters, gradients, cost_list = update(w, b, x_train, y_train, learning_rate, num_iterations)
    y_prediction_test = predict(parameters["weight"], parameters["bias"], x_test)
    y_prediction_train = predict(parameters["weight"], parameters["bias"], x_train)
    # report accuracy from the mean absolute prediction error
    print("train accuracy: {} %".format(100 - np.mean(np.abs(y_prediction_train - y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(y_prediction_test - y_test)) * 100))
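The helpers `initialize_weights_and_bias`, `update`, and `predict` are called above but their definitions did not survive into this record. A minimal sketch consistent with their names and call signatures (an assumption, not necessarily the notebook's exact code):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def initialize_weights_and_bias(dimension):
    # small constant weights and zero bias
    w = np.full((dimension, 1), 0.01)
    b = 0.0
    return w, b

def update(w, b, x_train, y_train, learning_rate, num_iterations):
    # batch gradient descent on the cross-entropy cost
    cost_list = []
    m = x_train.shape[1]
    for _ in range(num_iterations):
        y_head = sigmoid(np.dot(w.T, x_train) + b)
        cost = -np.mean(y_train * np.log(y_head) + (1 - y_train) * np.log(1 - y_head))
        dw = np.dot(x_train, (y_head - y_train).T) / m   # gradient w.r.t. weights
        db = np.sum(y_head - y_train) / m                # gradient w.r.t. bias
        w -= learning_rate * dw
        b -= learning_rate * db
        cost_list.append(cost)
    parameters = {"weight": w, "bias": b}
    gradients = {"derivative_weight": dw, "derivative_bias": db}
    return parameters, gradients, cost_list

def predict(w, b, x_test):
    # threshold the predicted probabilities at 0.5
    z = sigmoid(np.dot(w.T, x_test) + b)
    return (z > 0.5).astype(int)
```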
In [11]: # sklearn
from sklearn import linear_model
logreg = linear_model.LogisticRegression(random_state=42, max_iter=150)
print("test accuracy: {} ".format(logreg.fit(x_train.T, y_train.T).score(x_test.T, y_test.T)))
print("train accuracy: {} ".format(logreg.fit(x_train.T, y_train.T).score(x_train.T, y_train.T)))
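The same sklearn cell, restated as a self-contained example on a synthetic dataset so it runs on its own (the notebook's `x_train`/`x_test` are not reproduced here; `make_classification` is a stand-in):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# synthetic stand-in for the notebook's data
X, y = make_classification(n_samples=300, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

logreg = LogisticRegression(random_state=42, max_iter=150)
logreg.fit(X_train, y_train)
print("test accuracy: {}".format(logreg.score(X_test, y_test)))
print("train accuracy: {}".format(logreg.score(X_train, y_train)))
```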
Experiment 10
Aim – Create a Machine Learning Model using the Support Vector Machine algorithm.
Code –
In [45]: print("Jinesh Prajapat")
Jinesh Prajapat
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import preprocessing
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 299 entries, 0 to 298
Data columns (total 13 columns):
# Column Non-Null Count Dtype
In [49]: #Evaluating the target and finding out the potential skewness in the data
cols= ["#CD5C5C","#FF0000"]
ax = sns.countplot(x= data_df["DEATH_EVENT"], palette= cols)
ax.bar_label(ax.containers[0])
In [50]: # Doing Univariate Analysis for statistical description and understanding of dispersion of data
data_df.describe().T
In [51]: # Doing Bivariate Analysis by examining a correlation matrix of all the features using a heatmap
cmap = sns.diverging_palette(2, 165, s=80, l=55, n=9)
corrmat = data_df.corr()
plt.subplots(figsize=(12,12))
sns.heatmap(corrmat,cmap= cmap,annot=True, square=True)
In [53]: # Checking for potential outliers using the "Boxen and Swarm plots" of non binary features.
feature = ["age","creatinine_phosphokinase","ejection_fraction","platelets","serum_creatinine"]
for i in feature:
plt.figure(figsize=(10,7))
sns.swarmplot(x=data_df["DEATH_EVENT"], y=data_df[i], color="black", alpha=0.7)
sns.boxenplot(x=data_df["DEATH_EVENT"], y=data_df[i], palette=cols)
plt.show()
In [54]: # Plotting "Kernel Density Estimation (kde plot)" of time and age features - both of which are continuous
sns.kdeplot(x=data_df["time"], y=data_df["age"], hue =data_df["DEATH_EVENT"], palette=cols)
In [13]: # Defining independent and dependent attributes in training and test sets
X=data_df.drop(["DEATH_EVENT"],axis=1)
y=data_df["DEATH_EVENT"]
In [14]: # Setting up a standard scaler for the features and analyzing it thereafter
col_names = list(X.columns)
s_scaler = preprocessing.StandardScaler()
X_scaled= s_scaler.fit_transform(X)
X_scaled = pd.DataFrame(X_scaled, columns=col_names)
X_scaled.describe().T
          count          mean       std       min       25%       50%       75%       max
sex       299.0 -8.911489e-18  1.001676 -1.359272 -1.359272  0.735688  0.735688  0.735688
smoking   299.0 -1.188199e-17  1.001676 -0.687682 -0.687682 -0.687682  1.454161  1.454161
time      299.0 -1.901118e-16  1.001676 -1.629502 -0.739000 -0.196954  0.938759  1.997038
Out[56]: 0.7888888888888889
In [57]: # Printing classification report (since there was class imbalance in the target labels)
print(classification_report(y_test, y_pred))
accuracy 0.79 90
Experiment 11
Aim – Create a Machine Learning Model using the Decision Tree algorithm.
Code –
In [0]: print("Jinesh Prajapat")
Jinesh Prajapat
(1728, 7)
Out[4]:       0      1  2  3      4    5      6
        0 vhigh  vhigh  2  2  small  low  unacc
In [7]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1728 entries, 0 to 1727
Data columns (total 7 columns):
# Column Non-Null Count Dtype
In [9]: df['class'].value_counts()
Out[9]: unacc    1210
        acc       384
        good       69
        vgood      65
        Name: class, dtype: int64
In [10]: df.isnull().sum()
Out[10]: buying      0
         maint       0
         doors       0
         persons     0
         lug_boot    0
         safety      0
         class       0
         dtype: int64
X = df.drop(['class'], axis=1)
y = df['class']
Out[14]: buying      object
         maint       object
         doors       object
         persons     object
         lug_boot    object
         safety      object
         dtype: object
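All six features are categorical strings here, while `X_train.head()` below shows integers, so an ordinal-encoding step was applied in between. A sketch using sklearn's `OrdinalEncoder` on a tiny stand-in frame (the notebook may have used a different encoder; sklearn's codes start at 0 rather than 1):

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# two of the car-evaluation columns, with made-up rows for illustration
X_demo = pd.DataFrame({
    "buying": ["vhigh", "high", "med"],
    "safety": ["low", "med", "high"],
})

encoder = OrdinalEncoder()  # maps each category to an integer code
X_encoded = pd.DataFrame(encoder.fit_transform(X_demo), columns=X_demo.columns)
print(X_encoded)
```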
In [15]: X_train.head()
In [18]: X_train.head()
      buying  maint  doors  persons  lug_boot  safety
48         1      1      1        1         1       1
468        2      1      1        2         2       1
155        1      2      1        1         2       2
1721       3      3      2        1         2       2
1208       4      3      3        1         2       2
In [19]: X_test.head()
      buying  maint  doors  persons  lug_boot  safety
599        2      2      4        3         1       2
1201       4      3      3        2         1       3
628        2      2      2        3         3       3
1498       3      2      2        2         1       3
1263       4      3      4        1         1       1
Out[21]: DecisionTreeClassifier(max_depth=3, random_state=0)
In [27]: plt.figure(figsize=(8,6))
from sklearn import tree
tree.plot_tree(clf_gini.fit(X_train, y_train))
Out[27]: [Text(0.4, 0.875, 'x[5] <= 1.5\ngini = 0.455\nsamples = 1157\nvalue = [255, 49, 813, 40]'),
         Text(0.2, 0.625, 'gini = 0.0\nsamples = 386\nvalue = [0, 0, 386, 0]'),
         Text(0.6, 0.625, 'x[3] <= 2.5\ngini = 0.577\nsamples = 771\nvalue = [255, 49, 427, 40]'),
         Text(0.4, 0.375, 'x[0] <= 2.5\ngini = 0.631\nsamples = 525\nvalue = [255, 49, 181, 40]'),
         Text(0.2, 0.125, 'gini = 0.496\nsamples = 271\nvalue = [124, 0, 147, 0]'),
         Text(0.6, 0.125, 'gini = 0.654\nsamples = 254\nvalue = [131, 49, 34, 40]'),
         Text(0.8, 0.375, 'gini = 0.0\nsamples = 246\nvalue = [0, 0, 246, 0]')]
Out[28]: DecisionTreeClassifier(criterion='entropy', max_depth=3, random_state=0)
In [34]: plt.figure(figsize=(8,6))
from sklearn import tree
tree.plot_tree(clf_en.fit(X_train, y_train))
In [35]: # Print the Confusion Matrix and slice it into four pieces
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred_en)
print('Confusion matrix\n\n', cm)
Confusion matrix
[[ 73 0 56 0]
[ 20 0 0 0]
[ 12 0 385 0]
[ 25 0 0 0]]
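The overall accuracy can be read directly off the confusion matrix above: correct predictions sit on the diagonal, so accuracy is the trace divided by the total count.

```python
import numpy as np

# confusion matrix from the output above
cm = np.array([[ 73,  0,  56,  0],
               [ 20,  0,   0,  0],
               [ 12,  0, 385,  0],
               [ 25,  0,   0,  0]])

accuracy = np.trace(cm) / cm.sum()  # diagonal (correct) over all predictions
print(round(accuracy, 4))           # → 0.8021
```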
Experiment 12
Aim – Create a Machine Learning Model using the Random Forest algorithm.
Code –
In [0]: print("Jinesh Prajapat")
Jinesh Prajapat
(1728, 7)
Out[4]:       0      1  2  3      4    5      6
        0 vhigh  vhigh  2  2  small  low  unacc
In [7]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1728 entries, 0 to 1727
Data columns (total 7 columns):
# Column Non-Null Count Dtype
In [9]: df['class'].value_counts()
Out[9]: unacc    1210
        acc       384
        good       69
        vgood      65
        Name: class, dtype: int64
In [10]: df.isnull().sum()
Out[10]: buying      0
         maint       0
         doors       0
         persons     0
         lug_boot    0
         safety      0
         class       0
         dtype: int64
X = df.drop(['class'], axis=1)
y = df['class']
Out[14]: buying      object
         maint       object
         doors       object
         persons     object
         lug_boot    object
         safety      object
         dtype: object
In [15]: X_train.head()
In [18]: X_train.head()
      buying  maint  doors  persons  lug_boot  safety
48         1      1      1        1         1       1
468        2      1      1        2         2       1
155        1      2      1        1         2       2
1721       3      3      2        1         2       2
1208       4      3      3        1         2       2
In [19]: X_test.head()
      buying  maint  doors  persons  lug_boot  safety
599        2      2      4        3         1       2
1201       4      3      3        2         1       3
628        2      2      2        3         3       3
1498       3      2      2        2         1       3
1263       4      3      4        1         1       1
Out[23]: RandomForestClassifier(random_state=0)
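A sketch of how the feature-importance ranking below is typically obtained from a fitted `RandomForestClassifier` (synthetic data stands in for the encoded car-evaluation frame, so the scores here will differ from the output below):

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# synthetic stand-in with the same six column names
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X = pd.DataFrame(X, columns=["buying", "maint", "doors",
                             "persons", "lug_boot", "safety"])

rfc = RandomForestClassifier(random_state=0)
rfc.fit(X, y)

# importances sum to 1; sort descending as in the output below
feature_scores = pd.Series(rfc.feature_importances_,
                           index=X.columns).sort_values(ascending=False)
print(feature_scores)
```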
Out[24]: safety      0.295319
         persons     0.233856
         buying      0.151734
         maint       0.146653
         lug_boot    0.100048
         doors       0.072389
         dtype: float64
In [25]: # Creating a seaborn bar plot
sns.barplot(x=feature_scores, y=feature_scores.index)
In [30]: # Print the Confusion Matrix and slice it into four pieces
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print('Confusion matrix\n\n', cm)
Experiment 13
Aim – Create a Machine Learning Model using the K-means Clustering algorithm.
Code –
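A minimal K-means sketch consistent with the aim above (the dataset and the number of clusters are assumptions; `make_blobs` stands in for real data):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# synthetic 2-D data with three well-separated groups
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)       # cluster assignment for each sample

print(kmeans.cluster_centers_.shape)  # one centroid per cluster
print(kmeans.inertia_)                # within-cluster sum of squared distances
```

The elbow method is the usual way to pick `n_clusters`: fit for a range of k and look for the bend in the inertia curve.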