97 Ds

IU2041230097 DS
Practical-01
Aim:-Introduction to Jupyter Notebook.
Installation
You can use pip, a handy tool that comes with Python, to install
Jupyter Notebook like this:
$ pip install jupyter
The next most popular distribution of Python is Anaconda, which comes with
Jupyter Notebook preinstalled.
Starting the Jupyter Notebook Server
Open your terminal application, go to a folder of your choice, and run
the following command:
$ jupyter notebook
This will start up Jupyter and your default browser should start (or open a
new tab) to the following URL: http://localhost:8888/tree
Your browser should now look something like this:
Right now you are not actually running a Notebook; you are just running
the Notebook server.
Creating a Notebook
Click the New button (upper right) and choose Python 3.
Your web page should now look like this:
Naming
At the top of the page you will see the word Untitled, which is the title of
the Notebook. Click on it to rename the Notebook.
Let's try writing some code in the first cell and running it:
print('Hello Jupyter!')
Practical-02
Aim:-To Implement Python Basic Programs.
❖ Python program to print "Hello Python"
print('Hello Python')
Output: Hello Python
❖ Python program to do arithmetical operations

num1 = input('Enter first number: ')
num2 = input('Enter second number: ')
# names chosen so that the built-ins sum() and min() are not shadowed
total = float(num1) + float(num2)
difference = float(num1) - float(num2)
product = float(num1) * float(num2)
quotient = float(num1) / float(num2)
print('The sum of {0} and {1} is {2}'.format(num1, num2, total))
print('The subtraction of {0} and {1} is {2}'.format(num1, num2, difference))
print('The multiplication of {0} and {1} is {2}'.format(num1, num2, product))
print('The division of {0} and {1} is {2}'.format(num1, num2, quotient))
Output:
Enter first number: 10
Enter second number: 20
The sum of 10 and 20 is 30.0
The subtraction of 10 and 20 is -10.0
The multiplication of 10 and 20 is 200.0
The division of 10 and 20 is 0.5
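The program above stops with a ZeroDivisionError when the second number is 0. A minimal sketch of a guarded version follows; the function name arithmetic and the choice to report None for an undefined quotient are assumptions made for illustration and are not part of the original program.

```python
def arithmetic(num1, num2):
    """Return sum, difference, product and quotient of two numbers."""
    a, b = float(num1), float(num2)
    return {
        "sum": a + b,
        "difference": a - b,
        "product": a * b,
        # Division by zero is undefined, so report None instead of raising.
        "quotient": a / b if b != 0 else None,
    }

results = arithmetic("10", "20")
for name, value in results.items():
    print(f"The {name} of 10 and 20 is {value}")
```

Returning a dictionary keeps the four results together, so the caller decides how to print them.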
❖ Python program to find the area of a triangle

a = float(input('Enter first side: '))
b = float(input('Enter second side: '))
c = float(input('Enter third side: '))
# semi-perimeter
s = (a + b + c) / 2
# Heron's formula
area = (s*(s-a)*(s-b)*(s-c)) ** 0.5
print('The area of the triangle is %0.2f' %area)
Output:
❖ Python program to solve quadratic equation
import cmath
a = float(input('Enter a: '))
b = float(input('Enter b: '))
c = float(input('Enter c: '))
# discriminant
d = (b**2) - (4*a*c)
# cmath.sqrt also handles a negative discriminant (complex roots)
sol1 = (-b-cmath.sqrt(d))/(2*a)
sol2 = (-b+cmath.sqrt(d))/(2*a)
print('The solutions are {0} and {1}'.format(sol1,sol2))
Output:
Enter a: 8
Enter b: 5
Enter c: 9
The solutions are (-0.3125-1.0135796712641785j) and (-0.3125+1.01357967126
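The sign of the discriminant d decides whether the roots are real or complex. The sketch below makes that explicit; the function name solve_quadratic is hypothetical, and the check for a == 0 is an added assumption, since the original program does not guard against it.

```python
import cmath
import math

def solve_quadratic(a, b, c):
    """Return the two roots of a*x**2 + b*x + c = 0 (a must be non-zero)."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    d = b**2 - 4*a*c
    # Real square root when d >= 0, complex square root otherwise.
    root = math.sqrt(d) if d >= 0 else cmath.sqrt(d)
    return (-b - root) / (2*a), (-b + root) / (2*a)

print(solve_quadratic(1, -3, 2))  # (1.0, 2.0)
print(solve_quadratic(8, 5, 9))   # a complex-conjugate pair
```

With a non-negative discriminant the roots come back as plain floats rather than complex numbers with a zero imaginary part.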
❖ Python program to swap two variables
P = int(input("Please enter value for P: "))
Q = int(input("Please enter value for Q: "))
# swap through a temporary variable
temp_1 = P
P = Q
Q = temp_1
print("The Value of P after swapping: ", P)
print("The Value of Q after swapping: ", Q)
Output:
Please enter value for P: 13
Please enter value for Q: 43

The Value of P after swapping: 43
The Value of Q after swapping: 13
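Python can also swap two values without a temporary variable by using tuple unpacking. A small sketch, where the function name swap is an assumption introduced only for this example:

```python
def swap(p, q):
    """Return the two values in swapped order using tuple unpacking."""
    p, q = q, p
    return p, q

P, Q = swap(13, 43)
print("The value of P after swapping:", P)  # 43
print("The value of Q after swapping:", Q)  # 13
```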
❖ Python program to generate a random number
import random
# random.random() returns a float in the range [0.0, 1.0)
n = random.random()
print(n)
Output:
0.7632870997556201
If we run the code again, we will get a different output, for example:
0.8053503984689108
Generating a Number within a Given Range
import random
# random.randint(a, b) returns an integer between a and b, both included
n = random.randint(0,50)
print(n)
Output:
40
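Because every run produces different numbers, it can help to fix a seed when a result has to be reproduced. The following is a sketch under that assumption; the helper name seeded_sample and its default arguments are hypothetical.

```python
import random

def seeded_sample(seed, low=0, high=50, count=3):
    """Return `count` integers from [low, high] drawn from a seeded generator."""
    rng = random.Random(seed)  # independent generator, global state untouched
    return [rng.randint(low, high) for _ in range(count)]

first = seeded_sample(42)
second = seeded_sample(42)
print(first, second, first == second)
```

Using a separate random.Random instance keeps the seeding local, so other code that relies on the global generator is not affected.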
❖ Python program to display calendar
import calendar
yy = int(input("Enter year: "))
mm = int(input("Enter month: "))
print(calendar.month(yy,mm))
Output:
Enter year: 2022
Enter month: 6
     June 2022
Mo Tu We Th Fr Sa Su
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30
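The layout of the printed month follows from two facts: the weekday of the first day and the number of days in the month. The calendar module exposes both through monthrange; the sketch below wraps it in a hypothetical helper named month_summary.

```python
import calendar

def month_summary(year, month):
    """Return (weekday name of the 1st, number of days) for the given month."""
    first_weekday, days = calendar.monthrange(year, month)  # Monday is 0
    return calendar.day_name[first_weekday], days

print(month_summary(2022, 6))
```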
Practical-03
Aim:-Study of various Machine Learning libraries.
>Python libraries that are used in Machine Learning are:
1. NumPy: NumPy is a very popular Python library for processing large
multi-dimensional arrays and matrices, with the help of a large collection of
high-level mathematical functions. It is very useful for fundamental scientific
computations in Machine Learning, particularly for linear algebra, Fourier
transforms, and random number generation. High-end libraries like TensorFlow use
NumPy internally for manipulation of tensors.
import numpy as np
x = np.array([[1, 2], [3, 4]])

y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])
w = np.array([11, 12])
print(np.dot(v, w), "\n")
print(np.dot(x, v), "\n")
print(np.dot(x, y))
Output:
219
[29 67]
[[19 22]
[43 50]]
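To make explicit what np.dot computes in the three calls above, here is a plain-Python sketch without NumPy. The helper names dot, mat_vec and mat_mat are hypothetical and exist only for this illustration; for real work NumPy remains the better choice.

```python
def dot(v, w):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(v, w))

def mat_vec(m, v):
    """Matrix-vector product: one inner product per row of m."""
    return [dot(row, v) for row in m]

def mat_mat(m, n):
    """Matrix-matrix product: rows of m against columns of n."""
    columns = list(zip(*n))
    return [[dot(row, col) for col in columns] for row in m]

x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
v = [9, 10]
w = [11, 12]
print(dot(v, w))      # 219
print(mat_vec(x, v))  # [29, 67]
print(mat_mat(x, y))  # [[19, 22], [43, 50]]
```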
2. Pandas: Pandas is a popular Python library for data analysis. It is not
directly related to Machine Learning, but a dataset must be prepared before
training, and this is where Pandas comes in handy, as it was developed
specifically for data extraction and preparation. It provides high-level data
structures and a wide variety of tools for data analysis, including many
built-in methods for grouping, combining and filtering data.
import pandas as pd
data = {"country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.10, 3.286, 9.597, 1.221],
        "population": [200.4, 143.5, 1252, 1357, 52.98] }
data_table = pd.DataFrame(data)
print(data_table)
Output:
3. Matplotlib: Matplotlib is a very popular Python library for data
visualization. Like Pandas, it is not directly related to Machine Learning. It
comes in particularly handy when a programmer wants to visualize the patterns in
the data. It is a 2D plotting library used for creating 2D graphs and plots. A
module named pyplot makes plotting easy for programmers, as it provides features
to control line styles, font properties, axis formatting, etc. It provides
various kinds of graphs and plots for data visualization, viz., histograms,
error charts, bar charts, etc.
import matplotlib.pyplot as plt

import numpy as np
x = np.linspace(0, 10, 100)
plt.plot(x, x, label ='linear')
plt.legend()
plt.show()
Output:
4. TensorFlow: TensorFlow is a very popular open-source library for
high-performance numerical computation, developed by the Google Brain team at
Google. As the name suggests, TensorFlow is a framework for defining and running
computations involving tensors. It can train and run deep neural networks that
can be used to develop several AI applications. TensorFlow is widely used in
the field of deep learning research and application.
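import tensorflow as tf
x1 = tf.constant([1, 2, 3, 4])
x2 = tf.constant([5, 6, 7, 8])
# element-wise multiplication
result = tf.multiply(x1, x2)
# TensorFlow 2.x executes eagerly, so no tf.Session is needed
# (tf.Session is no longer part of the top-level API in 2.x)
print(result.numpy())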
Output:
[ 5 12 21 32]
5. Keras: Keras is a very popular Machine Learning library for Python. It is a
high-level neural networks API capable of running on top of TensorFlow, CNTK, or
Theano. It can run seamlessly on both CPU and GPU. Keras makes it really easy
for ML beginners to build and design a Neural Network. One of the best things
about Keras is that it allows for easy and fast prototyping.
6. PyTorch: PyTorch is a popular open-source Machine Learning library for Python
based on Torch, an open-source Machine Learning library that is implemented in C
with a wrapper in Lua. It has an extensive choice of tools and libraries that
support Computer Vision, Natural Language Processing (NLP), and many more ML
programs. It allows developers to perform computations on tensors with GPU
acceleration and also helps in creating computational graphs.
import torch
dtype = torch.float
device = torch.device("cpu")
# N: batch size, D_in: input size, H: hidden size, D_out: output size
N, D_in, H, D_out = 64, 1000, 100, 10
# random input and target data (torch.randn draws from a standard normal)
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)
# randomly initialised weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)
learning_rate = 1e-6
for t in range(500):
    # forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    # sum-of-squares loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)
    # backward pass: gradients of the loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)
    # gradient descent update
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
Output:
0 47168344.0
1 46385584.0
2 43153576.0
...
...
...
497 3.987660602433607e-05
498 3.945609932998195e-05
499 3.897604619851336e-05
7. SciPy: SciPy is a very popular library among Machine Learning enthusiasts, as
it contains different modules for optimization, linear algebra, integration and
statistics. There is a difference between the SciPy library and the SciPy stack:
the SciPy library is one of the core packages that make up the SciPy stack.
SciPy is also very useful for image manipulation.
from scipy.misc import imread, imsave, imresize
img = imread('D:/Programs/cat.jpg') # path of the image
print(img.dtype, img.shape)
# scale the red, green and blue channels to tint the image
img_tint = img * [1, 0.45, 0.3]
imsave('D:/Programs/cat_tinted.jpg', img_tint)
img_tint_resize = imresize(img_tint, (300, 300))
imsave('D:/Programs/cat_tinted_resized.jpg', img_tint_resize)
The functions imread, imsave and imresize have been removed from scipy.misc in
recent SciPy releases. If the import above fails, use imageio for reading and
saving the image instead. Note that imageio does not provide imresize, so the
resizing step needs another library.
!pip install imageio
import imageio
from imageio import imread, imsave
Original image:
Tinted image:
Resized tinted image:
8. Scikit-learn: Scikit-learn is one of the most popular ML libraries for
classical ML algorithms. It is built on top of two basic Python libraries, viz.,
NumPy and SciPy. Scikit-learn supports most of the supervised and unsupervised
learning algorithms. It can also be used for data mining and data analysis,
which makes it a great tool for anyone who is starting out with ML.
from sklearn import datasets

from sklearn import metrics
from sklearn.tree import DecisionTreeClassifier
dataset = datasets.load_iris()
model = DecisionTreeClassifier()
model.fit(dataset.data, dataset.target)
print(model)
expected = dataset.target
predicted = model.predict(dataset.data)
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
Output:
DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
                       max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort=False,
                       random_state=None, splitter='best')
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        50
           1       1.00      1.00      1.00        50
           2       1.00      1.00      1.00        50

   micro avg       1.00      1.00      1.00       150
   macro avg       1.00      1.00      1.00       150
weighted avg       1.00      1.00      1.00       150

[[50  0  0]
 [ 0 50  0]
 [ 0  0 50]]
9. Theano: Machine Learning is built largely on mathematics and statistics.
Theano is a popular Python library that is used to define, evaluate and optimize
mathematical expressions involving multi-dimensional arrays in an efficient
manner. It achieves this by optimizing the utilization of CPU and GPU. It also
includes extensive unit-testing and self-verification to detect and diagnose
different types of errors. Theano is a very powerful library that has been used
in large-scale, computationally intensive scientific projects for a long time,
yet it is simple and approachable enough to be used by individuals for their own
projects.
import theano
import theano.tensor as T
x = T.dmatrix('x')
s = 1 / (1 + T.exp(-x))
logistic = theano.function([x], s)
logistic([[0, 1], [-1, -2]])
Output:
array([[0.5       , 0.73105858],
       [0.26894142, 0.11920292]])
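The values in this output can be checked without Theano, since the expression s is just the logistic function applied element by element. A minimal plain-Python sketch, in which the function name logistic mirrors the Theano example but the implementation is an assumption for illustration:

```python
import math

def logistic(rows):
    """Apply s(x) = 1 / (1 + e**(-x)) to every element of a nested list."""
    return [[1 / (1 + math.exp(-value)) for value in row] for row in rows]

print(logistic([[0, 1], [-1, -2]]))
```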
Practical-04
Aim:-Introduction to GitHub Repository.
What is Git?
Git is a free and open-source distributed version control system designed to
handle everything from small to very large projects with speed and efficiency.
Git is built around distributed development of software, where more than one
developer may have access to the source code of an application and can make
changes to it that other developers can see.
It was initially designed and developed by Linus Torvalds in 2005 for Linux
kernel development.
Every Git working directory is a full-fledged repository with complete history
and full version tracking capabilities, independent of network access or a
central server.
Git allows a team of people to work together on the same files, and it helps
the team cope with the confusion that tends to arise when multiple people edit
the same files.
How does Git work?
A Git repository is a key-value object store in which every object is indexed
by its SHA-1 hash value.
All commits, files, tags, and filesystem tree nodes are different types of
objects living in this repository.
A Git repository is effectively a large hash table with no provision made for
hash collisions.
Git works by taking “snapshots” of the files in a project at each commit.
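The key-value idea can be illustrated by computing the key Git would assign to a file's content. The sketch below assumes the default SHA-1 object format and the blob header layout "blob <size>\0"; the function name git_blob_id is hypothetical.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b"hello world\n"))
# 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Because the ID depends only on the content, two files with identical content are stored as the same object.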
● Let us see how to host a local repository on GitHub from the very beginning
(creating a GitHub account).
A. Creating a GitHub Account
Step 1: Go to github.com, enter the required user credentials, and then click
the Sign up for GitHub button.
Step 2: Choose the plan that best suits you. The available plans are shown
below:
Step 3: Click Finish sign up.
The account has now been created, and you are automatically redirected to your
Dashboard.
B. Creating a new Repository

● Log in to your GitHub account.
● On the dashboard, click the green New repository button.
● Make sure to verify the GitHub account by opening the email sent to the
address provided when creating the account.
● Once verification is done, the following screen appears:
C. Give the repository a name and an optional description, and select the
visibility and accessibility mode for the repository.
D. Click Create repository.
E. The repository (in this case, ITE-304) is now created. The created
repository looks like this:
And here you go…
Practical-05
Aim:-Download the data set and perform the analysis.
CODE:-
from google.colab import files
# Upload StudentsPerformance.csv from the local machine
file = files.upload()
import pandas as pd
df = pd.read_csv('StudentsPerformance.csv')
# Show first 5 rows in a DataFrame
df.head()
# Show last 5 rows in a DataFrame
df.tail()
# Show last n rows in a DataFrame
n = 10
df.tail(n)
# Getting access to the shape attribute
df.shape
(1000, 8)
# Getting access to the index attribute
df.index
RangeIndex(start=0, stop=1000, step=1)
# Selecting a single column by label
df.loc[:,"gender"]
# Selecting a column by position instead: df.iloc[:, 5]
# Data types of each column
df.dtypes
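df.info()
# Summary statistics; numeric_only=True skips the text columns,
# which recent pandas versions no longer drop automatically
print(f"Count : {df.count()}")
print(f"Mean : {df.mean(numeric_only=True)}")
print(f"SD : {df.std(numeric_only=True)}")
print(f"Max : {df.max()}")
print(f"Min : {df.min()}")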
df.count()
df['math score'].idxmax()
149
df['math score'].idxmin()
59
df.round()
df['math score']
df.loc[: , ["gender","math score"]]
df.loc[: , ["gender","math score"]].dtypes
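import numpy as np
# Add a column of random language scores in the range 0-99
df['Language Score'] = np.random.randint(100,size = (1000))
df
# Per-student average of the three original scores (taking .mean() of each
# column first would assign the same constant to every row)
df["Average Score"]=(df["math score"]+df["reading score"]+df["writing score"])/3
df.head()
# Sort the math score in ascending order
df['math score'].sort_values(ascending = True)
# Sort the math score in descending order
df['math score'].sort_values(ascending = False)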
Practical-06
Aim:-Write a program to implement Linear Regression.
CODE:-
import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color = "m",
                marker = "o", s = 30)
    # predicted response vector
    y_pred = b[0] + b[1]*x
    # plotting the regression line
    plt.plot(x, y_pred, color = "g")
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    # function to show plot
    plt.show()

def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()
OUTPUT:-
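The estimated coefficients can be cross-checked without NumPy or Matplotlib. The following is a minimal plain-Python sketch that applies the same formulas as estimate_coef to the ten observations from main(); the helper name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Return (b_0, b_1) of the least-squares line y = b_0 + b_1 * x."""
    n = len(xs)
    m_x = sum(xs) / n
    m_y = sum(ys) / n
    # cross-deviation and deviation about x
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - n * m_x * m_y
    ss_xx = sum(x * x for x in xs) - n * m_x * m_x
    b_1 = ss_xy / ss_xx
    b_0 = m_y - b_1 * m_x
    return b_0, b_1

xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 3, 2, 5, 7, 8, 8, 9, 10, 12]
b_0, b_1 = fit_line(xs, ys)
print(f"b_0 = {b_0:.4f}, b_1 = {b_1:.4f}")  # b_0 = 1.2364, b_1 = 1.1697
```

Here SS_xy = 96.5 and SS_xx = 82.5, so the slope is 96.5 / 82.5 and the intercept follows from the two means 4.5 and 6.5.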
Practical-07
Aim:-Write a program to implement K-Nearest Neighbors.
CODE:
# Import necessary modules
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import numpy as np
import matplotlib.pyplot as plt
irisData = load_iris()
# Create feature and target arrays
X = irisData.data
y = irisData.target
# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size = 0.2, random_state=42)
neighbors = np.arange(1, 9)
train_accuracy = np.empty(len(neighbors))
test_accuracy = np.empty(len(neighbors))
# Loop over K values
for i, k in enumerate(neighbors):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    # Compute training and test data accuracy
    train_accuracy[i] = knn.score(X_train, y_train)
    test_accuracy[i] = knn.score(X_test, y_test)
# Generate plot
plt.plot(neighbors, test_accuracy, label = 'Testing dataset Accuracy')
plt.plot(neighbors, train_accuracy, label = 'Training dataset Accuracy')
plt.legend()
plt.xlabel('n_neighbors')
plt.ylabel('Accuracy')
plt.show()
OUTPUT:
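What KNeighborsClassifier does for a single prediction can be sketched in a few lines of plain Python: measure the distance to every training point, keep the k closest, and take a majority vote. The function name knn_predict and the six toy points below are made up for illustration and are not the Iris data; this is a sketch of the idea, not scikit-learn's implementation.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (1.1, 1.0)))  # a
print(knn_predict(points, labels, (5.1, 5.0)))  # b
```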
Practical-08
Aim:-Write a program for Automatic grouping of similar objects
into sets.
CODE:-
from itertools import groupby
test_list = ['ADITYA', 'coder_2', 'KIRTAN', 'coder_3', 'pro_3']
# groupby only merges adjacent items, so sort first
test_list.sort()
print ("The original list is : " + str(test_list))
res = [list(i) for j, i in groupby(test_list,
                      lambda a: a.split('_')[0])]
print ("The grouped list is : " + str(res))
from itertools import groupby
test_list = ['ADITYA', 'coder_2', 'KIRTAN', 'coder_3', 'pro_3']
test_list.sort()
print ("The original list is : " + str(test_list))
# same grouping, using partition() instead of split() to get the prefix
res = [list(i) for j, i in groupby(test_list,
                      lambda a: a.partition('_')[0])]
print ("The grouped list is : " + str(res))
test_list = ['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']
print("The original list is : " + str(test_list))
# collect the distinct prefixes (text before the underscore)
x=[]
for i in test_list:
    x.append(i[:i.index("_")])
x=list(set(x))
# gather the items belonging to each prefix
res=[]
for i in x:
    a=[]
    for j in test_list:
        if(j.find(i)!=-1):
            a.append(j)
    res.append(a)
# printing result
print("The grouped list is : " + str(res))

# the same grouping written as a list comprehension
print("The original list is : " + str(test_list))
res = [[item for item in test_list if item.startswith(prefix)] for prefix in
       set([item[:item.index("_")] for item in test_list])]
print("The grouped list is : " + str(res))

# grouping with a dictionary keyed by prefix
grouped = {}
for s in test_list:
    prefix = s.split('_')[0]
    if prefix not in grouped:
        grouped[prefix] = []
    grouped[prefix].append(s)
res = list(grouped.values())
print(res)

d = {}
for s in test_list:
    key = s.split('_')[0]
    if key in d:
        d[key].append(s)
    else:
        d[key] = [s]
res = list(d.values())
print("The original list is : " + str(test_list))
print("The grouped list is : " + str(res))
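The two dictionary-based versions above can be shortened with collections.defaultdict, which creates the empty list for a new key automatically. A small sketch, where the function name group_by_prefix and its sep parameter are assumptions introduced for this example:

```python
from collections import defaultdict

def group_by_prefix(items, sep="_"):
    """Group strings by the text before the first `sep`, keeping input order."""
    groups = defaultdict(list)
    for item in items:
        groups[item.partition(sep)[0]].append(item)
    return list(groups.values())

print(group_by_prefix(['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']))
# [['geek_1', 'geek_4'], ['coder_2', 'coder_3'], ['pro_3']]
```

Unlike the groupby versions, this needs no prior sorting, and unlike the set-based version the order of the groups is deterministic.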