Practical – 1
Aim – Introduction to Python
• What is Python
Python is a general-purpose, high-level programming language.
• On GitHub :
• About Python :
Very simple and straightforward syntax.
It can be your first programming language too.
Python is case sensitive
It is an Object Oriented Language
Dynamically typed
Indentation is used in place of curly braces
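A few of the properties listed above can be seen in a short snippet. This is only an illustrative sketch; the names `value`, `Value`, `item`, and `describe` are made up for the example.

```python
# Case sensitivity: `value` and `Value` are two different names.
value = 10
Value = "ten"

# Dynamic typing: the same name can be rebound to an object of another type.
item = 42
type_before = type(item).__name__
item = "forty-two"
type_after = type(item).__name__


def describe(number):
    # Indentation, not curly braces, marks the body of the function and of the branch.
    if number % 2 == 0:
        return "even"
    return "odd"


print(value, Value)             # two separate variables
print(type_before, type_after)  # int str
print(describe(value))          # even
```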
IU2141220140 DATA SCIENCE
• Library for :
Graphical user interfaces
Web frameworks
Multimedia
Databases
Networking
Test frameworks
Automation
Web scraping (Like crawler)
Documentation
System administration
Scientific computing
Text processing
Image processing
IOT
• Download and Installation :
Practical – 2
Aim – Introduction to Google Colab.
• Google is very active in AI research. Over many years, Google developed an
AI framework called TensorFlow and a development tool called Colaboratory.
TensorFlow is now open source, and since 2017 Google has made
Colaboratory free for public use. Colaboratory is now known as Google Colab
or simply Colab.
• Another attractive feature that Google offers developers is GPU access:
Colab supports GPUs free of charge. One reason for making it free
could be to establish Google's software as a standard in academia for
teaching machine learning and data science. It may also serve the long-term
goal of building a customer base for Google Cloud APIs, which are sold
on a per-use basis.
• Irrespective of the reasons, the introduction of Colab has made it easier to
learn and develop machine learning applications.
• Create/Upload/Share notebooks
Using Guidance :
• Executing Code
To execute the code, click on the arrow on the left side of the code window.
• To add more code to your notebook, select the following menu options −
Practical – 3
Aim : Study of various Machine Learning Libraries
Practical – 4
Aim : Introduction to Github Repository
Practical – 5
Aim : Write a program to implement Linear Regression
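The program for this practical is not reproduced in the text. As a minimal sketch of the idea, assuming simple one-variable linear regression fitted by ordinary least squares with NumPy (the toy data and the name `fit_line` are made up for illustration and are not the original program):

```python
import numpy as np


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Closed-form solution: slope = covariance(x, y) / variance(x)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Toy data generated from y = 2x + 1, so the fit should recover those values.
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]
slope, intercept = fit_line(x, y)
print(slope, intercept)

# Predict the target for a new input value.
print(slope * 5 + intercept)
```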
Practical – 6
Aim : Bank Churning using ANN
Practical – 7
Aim : Binary Classification using CNN.
4. Flattening: Once the convolutional and pooling layers have been
applied, the resulting feature maps are flattened into a one-
dimensional vector. This flattening process converts the spatial
information into a format that can be fed into a traditional neural
network.
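The flattening step can be illustrated with plain NumPy. This is a sketch only: the array shape (a batch of 2 images, 4 x 4 positions, 3 feature maps) is an assumption chosen for the example, not the shape used in this practical.

```python
import numpy as np

# Hypothetical output of the last pooling layer: a batch of 2 images,
# each with 4 x 4 spatial positions and 3 feature maps (channels).
feature_maps = np.arange(2 * 4 * 4 * 3, dtype=float).reshape(2, 4, 4, 3)

# Flattening keeps the batch axis and merges height, width, and channels
# into a single axis of length 4 * 4 * 3 = 48.
flattened = feature_maps.reshape(feature_maps.shape[0], -1)

print(feature_maps.shape)  # (2, 4, 4, 3)
print(flattened.shape)     # (2, 48)
```

In Keras, the `Flatten` layer performs the same reshaping inside the model.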
Example :
# Any results you write to the current directory are saved as output.
Using TensorFlow backend.
['training_set', 'test_set']
layer_names=True)
display(Image.open('cnn_model.png'))

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('../input/training_set/training_set/',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')
test_set = test_datagen.flow_from_directory('../input/test_set/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')
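from keras.callbacks import ModelCheckpoint

# Save the weights whenever validation accuracy improves.
# Note: recent Keras versions name this metric 'val_accuracy' and use fit() instead of fit_generator().
filepath = "best_model.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')
history = classifier.fit_generator(training_set,
                                   steps_per_epoch = 8000,
                                   epochs = 15,
                                   validation_data = test_set,
                                   validation_steps = 2000,
                                   callbacks = [checkpoint])
print(history.history.keys())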
Epoch 1/15
import numpy as np
from keras.preprocessing import image

test_image = image.load_img('../input/test_set/test_set/cats/cat.4009.jpg',
                            target_size = (64, 64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)
result = classifier.predict(test_image)
print(result)
print(training_set.class_indices)
# The network outputs a score between 0 and 1, so compare it with 0.5
# rather than testing for exactly 1.
if result[0][0] >= 0.5:
    prediction = 'dog'
else:
    prediction = 'cat'
print(prediction)
[[4.2898555e-30]]
{'cats': 0, 'dogs': 1}
cat
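A sigmoid output is a score between 0 and 1 rather than an exact class label, so it is usually turned into a label by comparing it with a threshold. A minimal sketch, assuming the classifier ends in a single sigmoid unit (the model definition is not shown in the source) and using a hypothetical helper `label_from_probability`:

```python
def label_from_probability(probability, class_indices, threshold=0.5):
    """Map a predicted score for class 1 to a class name."""
    # Invert the mapping, e.g. {'cats': 0, 'dogs': 1} -> {0: 'cats', 1: 'dogs'}
    index_to_name = {index: name for name, index in class_indices.items()}
    predicted_index = 1 if probability >= threshold else 0
    return index_to_name[predicted_index]


class_indices = {'cats': 0, 'dogs': 1}  # as printed by training_set.class_indices
print(label_from_probability(4.2898555e-30, class_indices))  # cats
print(label_from_probability(0.93, class_indices))           # dogs
```

Looking the name up through `class_indices` avoids hard-coding which class is 0 and which is 1.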
Practical – 8
AIM: Mini Project (Music Recommendation System)
Code:
import numpy as np
import pandas as pd
from typing import List, Dict
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
#This dataset contains name, artist, and lyrics for 57650 songs in English.
songs = pd.read_csv('songdata.csv')
songs.head()
songs.shape
(57650, 4)
#we are going to resample only 5000 random songs.
songs = songs.sample(n=5000).drop('link', axis=1).reset_index(drop=True)
#We can also notice the presence of \n in the text, so we are going to remove it.
songs['text'] = songs['text'].str.replace(r'\n', '', regex=True)
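#We now need to calculate the similarity of one lyric to another. We are going to use cosine
#similarity.
#The step that builds lyrics_matrix is missing from this listing. Given the TfidfVectorizer
#import above, it is assumed here to be a TF-IDF matrix of the lyrics; the parameters are an
#assumption.
tfidf = TfidfVectorizer(analyzer='word', stop_words='english')
lyrics_matrix = tfidf.fit_transform(songs['text'])
#We want to calculate the cosine similarity of each item with every other item in the
#dataset. So we just pass the lyrics_matrix as argument.
cosine_similarities = cosine_similarity(lyrics_matrix)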
#Once we get the similarities, we'll store in a dictionary the names of the 50 most similar
#songs for each song in our dataset.
similarities = {}
for i in range(len(cosine_similarities)):
    # Now we'll sort each element in cosine_similarities and get the indexes of the songs.
    similar_indices = cosine_similarities[i].argsort()[:-50:-1]
    # After that, we'll store in similarities each name of the 50 most similar songs.
    # Except the first one that is the same song.
    similarities[songs['song'].iloc[i]] = [(cosine_similarities[i][x], songs['song'][x],
                                            songs['artist'][x]) for x in similar_indices][1:]
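The ranking idea used in the loop above can be checked on a tiny example. This is a sketch under stated assumptions: the four vectors are made-up stand-ins for rows of a TF-IDF matrix, and cosine similarity is computed by hand with NumPy instead of scikit-learn.

```python
import numpy as np

# Hypothetical TF-IDF-like vectors for four short documents (one per row).
vectors = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 0.9],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5],
])

# Cosine similarity: dot product of the rows after scaling each row to unit length.
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
similarity = unit @ unit.T

# Rank the rows by similarity to row 0, most similar first.
# argsort is ascending, so the result is reversed; row 0 itself is filtered out.
ranked = similarity[0].argsort()[::-1]
top_two = [int(i) for i in ranked if i != 0][:2]
print(top_two)  # [1, 3]
```

The slice `[:-50:-1]` in the listing performs the same reversal but stops after 49 entries, so once the song itself is dropped, 48 neighbours remain per song.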
recommendation = {
    "song": songs['song'].iloc[10],
    "number_songs": 4
}
recommedations.recommend(recommendation)
recommendation2 = {
    "song": songs['song'].iloc[120],
    "number_songs": 4
}
recommedations.recommend(recommendation2)
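The `recommedations` object used above is not defined anywhere in this listing. A minimal sketch of what such a recommender might look like, assuming it wraps the `similarities` dictionary and accepts the request dictionaries shown above. The class name `ContentBasedRecommender`, the printed format, and the toy data are assumptions for illustration, not the original code.

```python
class ContentBasedRecommender:  # hypothetical name, not taken from the source
    def __init__(self, similarities):
        # similarities: {song name: [(score, song name, artist), ...]},
        # with each list already sorted from most to least similar.
        self.similarities = similarities

    def recommend(self, request):
        """Print and return the top `number_songs` entries for `request['song']`."""
        song = request["song"]
        number_songs = request["number_songs"]
        top = self.similarities[song][:number_songs]
        print(f"The {len(top)} recommended songs for {song} are:")
        for rank, (score, name, artist) in enumerate(top, start=1):
            print(f"{rank}. {name} by {artist} (similarity {score:.3f})")
        return top


# Toy similarity table standing in for the dictionary built from the lyrics.
toy_similarities = {
    "Song A": [(0.9, "Song B", "Artist 1"),
               (0.7, "Song C", "Artist 2"),
               (0.2, "Song D", "Artist 3")],
}
recommedations = ContentBasedRecommender(toy_similarities)
result = recommedations.recommend({"song": "Song A", "number_songs": 2})
```

With the real data, the object would be built from the `similarities` dictionary computed earlier instead of `toy_similarities`.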