AI Lab Manual For V 5SEM PDF
Program 1.a
AIM: Breadth-First Search (BFS) is an algorithm for traversing or searching tree or graph data
structures. It starts at the tree root (or an arbitrary node of a graph) and explores all of the neighbor nodes at
the present depth before moving on to the nodes at the next depth level.
CODE:
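from collections import deque
graph = {'A': ['B', 'C'], 'B': ['A', 'D', 'E'], 'C': ['A', 'F'],  # assumed: same graph as Program 1.b
         'D': ['B'], 'E': ['B', 'F'], 'F': ['C', 'E']}
def bfs(graph, start):
    visited = {start}           # vertices already queued or printed
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        print(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
bfs(graph, 'A')
OUTPUT: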
A
B
C
D
E
F
Program 1.b
AIM:
Depth-First Search (DFS) is an algorithm used to traverse or search a graph or tree data structure. The
name comes from the word "depth": it prioritizes depth and searches along one
branch as far as it can go, backtracking only when it reaches the end of the branch. In Python, it is easily implemented using
recursion together with data structures such as dictionaries and sets.
CODE:
# Define a weighted graph as an adjacency list with weights
graph = {
    'A': [('B', 1), ('C', 2)],
    'B': [('A', 1), ('D', 3), ('E', 4)],
    'C': [('A', 2), ('F', 5)],
    'D': [('B', 3)],
    'E': [('B', 4), ('F', 6)],
    'F': [('C', 5), ('E', 6)]
}
def dfs_weighted(graph, start, visited=None, total_weight=0):
    if visited is None:
        visited = set()
    visited.add(start)
    print(f"Visiting {start}, Total Weight: {total_weight}")
    for neighbor, weight in graph[start]:
        if neighbor not in visited:
            dfs_weighted(graph, neighbor, visited, total_weight + weight)
# Start DFS from vertex 'A' with a total weight of 0
dfs_weighted(graph, 'A', total_weight=0)
OUTPUT:
Visiting A, Total Weight: 0
Visiting B, Total Weight: 1
Visiting D, Total Weight: 4
Visiting E, Total Weight: 5
Visiting F, Total Weight: 11
Visiting C, Total Weight: 16
Informed Search strategies
AIM: Greedy Best-First Search is an informed search algorithm that tries to follow the most promising path from
a given starting point to a goal. It evaluates each candidate node with a heuristic estimate h(n) of the remaining cost to the goal and always
expands the node with the lowest estimate. Because it ignores the cost already paid, the path it returns is not guaranteed to be the cheapest one.
ALGORITHM:
Let OPEN be a priority queue containing the initial state
LOOP
If OPEN is empty, return failure
Node <- Remove-First(OPEN)
If Node is a Goal
Then return the path from the initial state to the Goal
Else generate all successors of Node and
put the newly generated nodes into OPEN according to their f values (for greedy search, f(n) = h(n)).
END LOOP
CODE:
EXPECTED OUTPUT:
Path from A to G is: A -> C -> E -> G
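The loop in the ALGORITHM section can be sketched in Python as follows. This is a minimal sketch rather than the manual's own listing: it assumes the same graph and heuristic table that the A* program in part b uses, and the function name greedy_best_first is hypothetical. Successors are ordered by h(n) alone, and edge costs are ignored.

```python
import heapq

def greedy_best_first(graph, start, goal, heuristic):
    # OPEN is a priority queue ordered by the heuristic value h(n) only
    open_list = [(heuristic[start], start)]
    came_from = {start: None}          # also serves as the "already generated" set
    while open_list:
        _, node = heapq.heappop(open_list)
        if node == goal:
            # Walk the parent links back to the start to rebuild the path
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for successor in graph[node]:
            if successor not in came_from:
                came_from[successor] = node
                heapq.heappush(open_list, (heuristic[successor], successor))
    return None                        # OPEN is empty: failure

# Assumed example data (taken from the A* program in part b)
graph = {
    'A': {'B': 5, 'C': 3}, 'B': {'D': 8, 'E': 6}, 'C': {'E': 2, 'F': 4},
    'D': {'G': 9}, 'E': {'G': 7}, 'F': {}, 'G': {},
}
heuristic = {'A': 10, 'B': 8, 'C': 7, 'D': 6, 'E': 4, 'F': 3, 'G': 0}

path = greedy_best_first(graph, 'A', 'G', heuristic)
print("Path from A to G is:", " -> ".join(path))
```

With this assumed data the search expands A, C, F, E and then G, so the printed path agrees with the expected output above.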
b. A* algorithm.
A* is a search algorithm used to find the shortest path between an initial and a final
point. It is often used in map traversal to find the shortest route to
be taken. It orders nodes by f(n) = g(n) + h(n), where g(n) is the cost of the path from the start to n and h(n) is a heuristic estimate of the remaining cost to the goal. A* was initially designed as a graph traversal algorithm to help build a robot that
can find its own course.
CODE:
import heapq
def astar_search(graph, start, goal, heuristic):
    # Each heap entry is (f_score, g_score, node); the lowest f_score is popped first
    open_list = [(0 + heuristic[start], 0, start)]
    came_from = {}
    g_score = {node: float('inf') for node in graph}
    g_score[start] = 0
    while open_list:
        _, current_cost, current_node = heapq.heappop(open_list)
        if current_node == goal:
            path = reconstruct_path(came_from, current_node)
            return path, current_cost
        for neighbor, cost in graph[current_node].items():
            tentative_g_score = g_score[current_node] + cost
            if tentative_g_score < g_score[neighbor]:
                g_score[neighbor] = tentative_g_score
                f_score = g_score[neighbor] + heuristic[neighbor]
                heapq.heappush(open_list, (f_score, tentative_g_score, neighbor))
                came_from[neighbor] = current_node
    return None, float('inf')  # Return None and infinity if no path is found
def reconstruct_path(came_from, current_node):
    path = [current_node]
    while current_node in came_from:
        current_node = came_from[current_node]
        path.insert(0, current_node)
    return path
# Example usage:
# Define your graph as an adjacency list
graph = {
    'A': {'B': 5, 'C': 3},
    'B': {'D': 8, 'E': 6},
    'C': {'E': 2, 'F': 4},
    'D': {'G': 9},
    'E': {'G': 7},
    'F': {},
    'G': {}
}
# Define your heuristic values for each node
heuristic = {
    'A': 10,
    'B': 8,
    'C': 7,
    'D': 6,
    'E': 4,
    'F': 3,
    'G': 0
}
start_node = 'A'
goal_node = 'G'
path, total_cost = astar_search(graph, start_node, goal_node, heuristic)
if path:
    print(f'Path from {start_node} to {goal_node}: {" -> ".join(path)}')
    print(f'Lowest total cost: {total_cost}')
else:
    print(f'No path from {start_node} to {goal_node} found.')
Introduction to Prolog:
Prolog (Programming in Logic) is a programming language particularly well-suited for
symbolic reasoning and manipulation. It is based on formal logic and is widely used in
artificial intelligence and natural language processing. In Prolog, you define relations and
rules, and the system can then query these relations to derive new information.
Basics of Prolog
FACTS: In Prolog, you declare facts about relationships and properties. Facts are statements
that are true.
Example: raining.
QUERIES: Prolog can be queried to find relationships based on the defined facts and rules.
Example: ?- raining.
?- is the Prolog prompt. To this query, Prolog will answer yes: raining is true because (from
above) Prolog matches it in its database of facts.
Rules: You can define rules that express relationships based on facts.
Example: relation(<argument1>,<argument2>,....,<argumentN> ).
Relation names must begin with a lowercase letter
likes(john,mary).
Note: Facts have some simple rules of syntax. Facts should always begin with a lowercase
letter and end with a full stop. The facts themselves can consist of any letter or number
combination, as well as the underscore _ character. However, names containing the
characters -,+,*,/, or other mathematical operators should be avoided.
?- eats(fred,apples). /* does fred eat apples */
facts
sing_a_song(ananya).
listens_to_music(rohit).
listens_to_music(ananya) :- sing_a_song(ananya).
happy(ananya) :- sing_a_song(ananya).
happy(rohit) :- listens_to_music(rohit).
playes_guitar(rohit) :- listens_to_music(rohit).
Queries
| ?- happy(rohit).
| ?- sing_a_song(rohit).
| ?- sing_a_song(ananya).
| ?- playes_guitar(rohit).
| ?- playes_guitar(ananya).
| ?- listens_to_music(ananya).
2) FACTs
woman(mia).
woman(jody).
woman(yolanda).
playsAirGuitar(jody).
Queries
?- woman(mia).
?- playsAirGuitar(jody).
?- playsAirGuitar(mia).
?- playsAirGuitar(vincent).
DESCRIPTION: In Prolog, you can represent a family tree using predicates to define
relationships.
In this example:
CODE:
% Facts
male(john).
male(bob).
male(jim).
female(jane).
female(susan).
female(emily).
parent(john, bob).
parent(john, jim).
parent(jane, bob).
parent(jane, jim).
parent(bob, jack).
parent(susan, jack).
parent(jim, emily).
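% Rules (assumed standard definitions; the rule bodies are missing from this listing)
father(X, Y) :- male(X), parent(X, Y).
mother(X, Y) :- female(X), parent(X, Y).
sibling(X, Y) :- parent(Z, X), parent(Z, Y), X \= Y.
grandparent(X, Z) :- parent(X, Y), parent(Y, Z).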
% Queries
?- father(john, bob).
% true
?- mother(jane, jim).
% ----
?- sibling(bob, jim).
% ----
?- grandparent(john, jack).
% ----
?- grandparent(susan, emily).
% ----
Experiment 4
AIM: Write a program to train and validate the following classifiers on given dataset(Scikit-
learn).
a) Decision Tree
The aim of a decision tree classifier is to make predictions or decisions based on a set of input
features. Decision trees are a popular machine learning algorithm used for both classification and
regression tasks. Here are the primary goals or aims of a decision tree classifier.
DESCRIPTION:
EXPECTED OUTPUT:
Text processing using NLTK .
a. Remove stop words: In natural language processing (NLP), stop words are common
words that are often removed from text data during the preprocessing stage. These words are
considered to be of little value for information retrieval or text analysis because they
occur very frequently in a language and typically contribute little to the meaning of a
sentence. Examples of stop words in English include "the," "and," "is," and "in."
Removing stop words means filtering these common words out of a given
text to focus on the more meaningful words.
CODE :
import nltk
nltk.download('punkt') # Download necessary resources
REMOVING STOPWORDS LIKE is,an,a
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
nltk.download('stopwords')
nltk.download('punkt')
def remove_stop_words(text):
    # Tokenize the text
    words = word_tokenize(text)
    # Get the list of English stop words
    stop_words = set(stopwords.words('english'))
    # Remove stop words
    filtered_words = [word for word in words if word.lower() not in stop_words]
    # Join the filtered words back into a sentence
    filtered_text = ' '.join(filtered_words)
    return filtered_text
# Example usage
input_text = "This is an example sentence with some stop words."
output_text = remove_stop_words(input_text)
print("Original text:", input_text)
print("Text without stop words:", output_text)
b. Implement stemming
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer
nltk.download('punkt')
# Example text
text = "programming programmed."
# Tokenize the text into words
words = word_tokenize(text)
# Create a Porter stemmer
stemmer = PorterStemmer()
# Apply stemming to each word
stemmed_words = [stemmer.stem(word) for word in words]
# Print the results
print("Original Words:")
print(words)
print("\nStemmed Words:")
print(stemmed_words)
EXPECTED OUTPUT
Original Words:
['programming', 'programmed', '.']
Stemmed Words:
['program', 'program', '.']
POS (Parts of Speech) tagging
import nltk
# Download the missing resource
nltk.download('averaged_perceptron_tagger')
# Now, you should be able to use the pos_tag function without errors
from nltk import pos_tag
from nltk.tokenize import word_tokenize
# Example text
text = "the cat sat on the mat."
# Tokenize the text into words
words = word_tokenize(text)
# Part-of-speech tagging
tagged_words = pos_tag(words)
print("Part-of-Speech Tags:", tagged_words)
i. Decision Tree
ii. K-Nearest Neighbour
iii. Naïve Bayes
iv. Support Vector Machine
Linear Regression
Consider the relationship between the number of hours a student
studies and the percentage of marks that the student scores in an
exam. Given the number of hours a
student prepares for a test, about how high a score can the
student achieve? If we plot the independent variable (hours)
on the x-axis and the dependent variable (percentage) on the y-axis,
linear regression gives us the straight line that best fits the
data points, as shown in the figure below.
y = mx + b
Here m is the slope and b is the intercept (b_1 and b_0 in the program).
Program
import numpy as np
import matplotlib.pyplot as plt
def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)
def plot_regression_line(x, y, b):
# putting labels
plt.xlabel('x')
plt.ylabel('y')
def main():
# observations / data
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# estimating coefficients
b = estimate_coef(x, y)
print("Estimated coefficients:\nb_0 = {} \
main()
Output
Estimated coefficients:
b_0 = -0.0586206896552
b_1 = 1.45747126437
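The coefficient formulas used in estimate_coef can be checked by hand on a small data set. The following is a minimal sketch in plain Python; the function name fit_line and the hours/marks values are hypothetical and chosen so that the points lie exactly on the line y = 2x + 1.

```python
def fit_line(xs, ys):
    # Least-squares estimates: b_1 = SS_xy / SS_xx and b_0 = mean(y) - b_1 * mean(x)
    n = len(xs)
    m_x = sum(xs) / n
    m_y = sum(ys) / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - n * m_x * m_y
    ss_xx = sum(x * x for x in xs) - n * m_x * m_x
    b_1 = ss_xy / ss_xx
    b_0 = m_y - b_1 * m_x
    return b_0, b_1

# Hypothetical data: hours studied and marks scored, lying exactly on y = 2x + 1
hours = [0, 1, 2, 3, 4]
marks = [1, 3, 5, 7, 9]
b_0, b_1 = fit_line(hours, marks)
print(f"b_0 = {b_0}, b_1 = {b_1}")
```

Because the sample points are perfectly linear, the fitted intercept and slope recover 1 and 2, which makes the arithmetic easy to follow step by step.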
Logistic Regression
PROGRAM
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
dataset = pd.read_csv("User_Data.csv")
# input
x = dataset.iloc[:, [2, 3]].values
# output
y = dataset.iloc[:, 4].values
Output:
[[ 0.58164944 -0.88670699]
[-0.60673761 1.46173768]
[-0.01254409 -0.5677824 ]
[-0.60673761 1.89663484]
[ 1.37390747 -1.40858358]
[ 1.47293972 0.99784738]
[ 0.08648817 -0.79972756]
[-0.01254409 -0.24885782]
[-0.21060859 -0.5677824 ]
[-0.21060859 -0.19087153]]
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(xtrain, ytrain)
y_pred = classifier.predict(xtest)
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(ytest, y_pred)
print ("Confusion Matrix : \n", cm)
Output:
Confusion Matrix :
[[65 3]
[ 8 24]]
Output:
Accuracy : 0.89
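The accuracy figure follows directly from the confusion matrix printed above. The sketch below recomputes it in plain Python, assuming the scikit-learn layout in which rows are true labels and columns are predicted labels, ordered 0 then 1; the variable names are illustrative only.

```python
# Confusion matrix from the output above: rows = true class, columns = predicted class
cm = [[65, 3],
      [8, 24]]

tn, fp = cm[0]   # true class 0: predicted 0, predicted 1
fn, tp = cm[1]   # true class 1: predicted 0, predicted 1

total = tn + fp + fn + tp
accuracy = (tp + tn) / total       # correct predictions over all predictions
precision = tp / (tp + fp)         # how many predicted positives are right
recall = tp / (tp + fn)            # how many actual positives are found

print(f"Accuracy : {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall   : {recall:.2f}")
```

The 100 test samples contain 89 correct predictions (65 + 24), which gives the reported accuracy of 0.89.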
from matplotlib.colors import ListedColormap
X_set, y_set = xtest, ytest
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1,
                               stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1,
                               stop = X_set[:, 1].max() + 1, step = 0.01))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
OUTPUT:
Decision Tree
Algorithms
Information Gain
Gain= Entropy(S)-I(Attribute)
1. If all examples are positive, return the single-node tree with label = +
2. If all examples are negative, return the single-node tree with label = -
3. If the attribute list is empty, return the single-node tree labelled with the most common class
4. Otherwise, the attribute with the highest information gain becomes the root node. The examples are
grouped by the values of that attribute, and each group is processed in the same way as the parent
(root) node.
5. Within each group, the feature with the maximum information gain becomes the next node, and
this process continues until a leaf node is reached.
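The gain formula above can be made concrete with a few lines of plain Python. This is a minimal sketch under stated assumptions: the function names entropy and information_gain, and the four-row weather-style data set, are hypothetical and only illustrate Gain = Entropy(S) - I(Attribute).

```python
import math
from collections import Counter

def entropy(labels):
    # Entropy(S) = -sum(p_i * log2(p_i)) over the class proportions p_i
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    # I(Attribute): entropy of each group of rows sharing an attribute value,
    # weighted by the relative size of the group
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    weighted = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - weighted

# Hypothetical training examples
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["-", "-", "+", "+"]

for attribute in ("outlook", "windy"):
    print(attribute, information_gain(rows, labels, attribute))
```

In this toy data "outlook" separates the classes perfectly and "windy" not at all, so "outlook" has the larger gain and would be chosen as the root node.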
PROGRAM:
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
# Function importing Dataset
def importdata():
    balance_data = pd.read_csv(
        'https://archive.ics.uci.edu/ml/machine-learning-' +
        'databases/balance-scale/balance-scale.data',
        sep=',', header=None)
    print("Dataset: ", balance_data.head())
    return balance_data
# Function to split the dataset
def splitdataset(balance_data):
X = balance_data.values[:, 1:5]
Y = balance_data.values[:, 0]
# Performing training
clf_gini.fit(X_train, y_train)
return clf_gini
clf_entropy.fit(X_train, y_train)
return clf_entropy
# Function to make predictions
def prediction(X_test, clf_object):
    # Prediction on test with giniIndex
    y_pred = clf_object.predict(X_test)
    print("Predicted values:")
    print(y_pred)
    return y_pred
# Function to calculate accuracy
print("Report : ",
classification_report(y_test, y_pred))
# Driver code
def main():
# Building Phase
data = importdata()
X, Y, X_train, X_test, y_train, y_test = splitdataset(data)
clf_gini = train_using_gini(X_train, X_test, y_train)
clf_entropy = tarin_using_entropy(X_train, X_test, y_train)
# Operational Phase
print("Results Using Gini Index:")
Output:
Results Using Entropy: Predicted values: ['R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'R' 'R' 'R' 'L'
'L' 'R' 'L' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'L' 'R' 'L'
'R' 'L' 'R' 'L' 'R' 'R' 'L' 'L' 'R' 'L' 'L' 'R' 'L' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'L'
'R' 'L' 'L' 'L' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'L'
'L' 'R' 'R' 'L' 'L' 'L' 'R' 'L' 'L' 'R' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'L'
'R' 'R' 'R' 'L' 'L' 'L' 'L' 'L' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'R' 'L'
'L' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'L' 'R' 'L' 'R'
'L' 'L' 'L' 'L' 'L' 'R' 'R' 'R' 'L' 'L' 'L' 'R' 'R' 'R']
Confusion Matrix:
[[ 0  6  7]
 [ 0 63 22]
 [ 0 20 70]]
Accuracy : 70.74468085106383
Report :
              precision    recall  f1-score   support
B                  0.00      0.00      0.00        13
L                  0.71      0.74      0.72        85
R                  0.71      0.78      0.74        90
accuracy                               0.71       188
macro avg          0.47      0.51      0.49       188
weighted avg       0.66      0.71      0.68       188
NAIVE BAYES
Write a program to implement the naive Bayesian classifier for a sample training
data set stored as a .CSV file. Compute the accuracy of the classifier,
considering a few test data sets.
Description
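Its mathematical formula (Bayes' theorem) is as follows:
P(A | B) = P(B | A) * P(A) / P(B)
Where A is the hypothesis (the class) and B is the evidence (the observed predictor values).
Now, this Bayes theorem can be used to generate the following classification model:
P(y | x1, ..., xn) = P(y) * P(x1 | y) * ... * P(xn | y) / P(x1, ..., xn)
Where y is the class variable and x1, ..., xn are the predictor values.
The Naive Bayes method makes the assumption that the predictors contribute equally
and independently to selecting the output class.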
Although the Naive Bayes model’s assumption that all predictors are independent of
one another rarely holds in real-world circumstances, it still produces a
satisfactory outcome in the majority of instances.
Naive Bayes is often used for text categorization since the dimensionality of the data
is frequently rather large.
Gaussian Naive Bayes: This classifier is employed when the predictor values are continuous and are
expected to follow a Gaussian distribution.
Bernoulli Naive Bayes: This classifier is utilized when the predictors are boolean in nature and are
assumed to follow the Bernoulli distribution.
Multinomial Naive Bayes: This classifier makes use of a multinomial distribution and is often used to solve
problems involving document or text classification.
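The Gaussian variant can be sketched end to end in plain Python. This is an illustrative sketch only: the function names log_gaussian, fit and predict, and the six-row data set, are hypothetical. Each row holds the feature values followed by the class label, and log-probabilities are summed so that very small densities do not underflow.

```python
import math
from collections import defaultdict

def log_gaussian(x, mean, stdev):
    # Logarithm of the Gaussian probability density function
    return -((x - mean) ** 2) / (2 * stdev ** 2) - math.log(math.sqrt(2 * math.pi) * stdev)

def fit(rows):
    # Group feature vectors by class, then store the prior and per-feature (mean, stdev)
    by_class = defaultdict(list)
    for *features, label in rows:
        by_class[label].append(features)
    model = {}
    for label, items in by_class.items():
        stats = []
        for column in zip(*items):
            mean = sum(column) / len(column)
            variance = sum((v - mean) ** 2 for v in column) / (len(column) - 1)
            stats.append((mean, math.sqrt(variance)))
        model[label] = (len(items) / len(rows), stats)
    return model

def predict(model, features):
    # Choose the class with the highest log P(y) + sum of log P(x_i | y)
    best_label, best_score = None, float("-inf")
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, stdev) in zip(features, stats):
            score += log_gaussian(x, mean, stdev)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: two features followed by the class label
rows = [
    (1.0, 2.0, 0), (1.2, 1.8, 0), (0.8, 2.2, 0),
    (5.0, 6.0, 1), (5.5, 5.8, 1), (4.5, 6.2, 1),
]
model = fit(rows)
print(predict(model, (1.1, 2.0)))
print(predict(model, (5.2, 6.1)))
```

The first query sits next to the class 0 points and the second next to the class 1 points, so the two predictions differ accordingly.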
PROGRAM:
# Importing library
import math
import random
import csv
# Calculating Mean
def mean(numbers):
return sum(numbers) / float(len(numbers))
def MeanAndStdDev(mydata):
    info = [(mean(attribute), std_dev(attribute)) for attribute in zip(*mydata)]
    # eg: list = [ [a, b, c], [m, n, o], [x, y, z]]
    # here mean of 1st attribute = (a + m + x)/3, mean of 2nd attribute = (b + n + y)/3
    # delete summaries of last class
    del info[-1]
    return info
# Calculate Gaussian Probability Density Function
def calculateGaussianProbability(x, mean, stdev):
expo = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
return (1 / (math.sqrt(2 * math.pi) * stdev)) * expo
# Accuracy score
def accuracy_rate(test, predictions):
    correct = 0
    for i in range(len(test)):
        if test[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(test))) * 100.0
# driver code
# load the file and store it in mydata list
mydata = csv.reader(open(filename, "rt"))
mydata = list(mydata)
mydata = encode_class(mydata)
for i in range(len(mydata)):
    mydata[i] = [float(x) for x in mydata[i]]
# prepare model
info = MeanAndStdDevForClass(train_data)
# test model
predictions = getPredictions(info, test_data)
accuracy = accuracy_rate(test_data, predictions)
print("Accuracy of your model is: ", accuracy)
OUTPUT:
Total number of examples are: 768 Out of these, training examples are: 537 Test
examples are: 231 Accuracy of your model is: 74.02597402597402
K-Nearest Neighbour algorithm:
PROGRAM:
# Import necessary modules
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
# Loading data
irisData = load_iris()
# Create feature and target arrays
X = irisData.data
y = irisData.target
# Predict on dataset which model has not seen before
print(knn.predict(X_test))
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
# Loading data
irisData = load_iris()
# Create feature and target arrays
X = irisData.data
y = irisData.target
knn.fit(X_train, y_train)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import numpy as np
import matplotlib.pyplot as plt
irisData = load_iris()
# Create feature and target arrays
X = irisData.data
y = irisData.target
test_accuracy = np.empty(len(neighbors))
# Loop over K values
for i, k in enumerate(neighbors):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    # Compute training and test data accuracy
    train_accuracy[i] = knn.score(X_train, y_train)
    test_accuracy[i] = knn.score(X_test, y_test)
# Generate plot
Output:
In the example shown above, we create a plot to find the k-value that
gives the highest accuracy.
Note: This technique is not the usual way to choose the
value of n_neighbors in practice. Instead, hyperparameter tuning (for example, a cross-validated search) is used to choose the value that
gives the best performance.
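What KNeighborsClassifier does at prediction time can be sketched in plain Python. This is a minimal illustration rather than scikit-learn's implementation; the function name knn_predict and the sample points are hypothetical, and Euclidean distance with a simple majority vote is assumed.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    # Sort the training points by Euclidean distance to the query point
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Majority vote among the k nearest neighbours
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-class training data
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (2, 2), k=3))
print(knn_predict(train_points, train_labels, (6, 5), k=3))
```

Changing k changes how many neighbours take part in the vote, which is the quantity the accuracy plot above is varying.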
import pandas as pd
data = pd.read_csv("apples_and_oranges.csv")
Splitting the dataset into training and test samples
X_train = training_set.iloc[:,0:2].values
Y_train = training_set.iloc[:,2].values
X_test = test_set.iloc[:,0:2].values
Y_test = test_set.iloc[:,2].values
Y_pred = classifier.predict(X_test)
test_set["Predictions"] = Y_pred
Calculating the accuracy of the predictions
Output:
AIM: Demonstration of clustering algorithms using
a.K-Means
b. Hierarchical algorithms (agglomerative etc.). Interpret the clusters obtained.
K - means clustering
Aim – To demonstrate k-means clustering algorithm
Objective – Plotting of clusters using k-means clustering algorithm
Theory –
K-Means Clustering is an unsupervised learning algorithm that groups an unlabeled dataset into different clusters.
Here K is the number of clusters to be created, chosen in advance: if K=2 there will be two clusters,
for K=3 there will be three clusters, and so on.
It offers a convenient way to discover groups in an
unlabeled dataset on its own, without the need for any labelled training data. It is a centroid-based algorithm, where each cluster is associated
with a centroid. The main aim of this algorithm is to minimize the sum of squared distances between the data points and the centroids of their
corresponding clusters.
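The two alternating steps of the algorithm (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched in plain Python. This is a simplified illustration, not the scikit-learn implementation; the function name kmeans, the fixed starting centroids and the sample points are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid)
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical data with two well-separated groups and K = 2
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

A real implementation would also choose the starting centroids at random and stop once the assignments no longer change; both are left out here to keep the sketch short.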
Code –
# Run once in a terminal: pip install -U scikit-learn
# import required libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
dataset = make_blobs(n_samples=200, centers=4, n_features=2, cluster_std=1.6,
random_state=50)
points = dataset[0]
# Import KMeans
from sklearn.cluster import KMeans
# Create kmeans object
kmeans = KMeans(n_clusters=4)
# fit the kmeans object to dataset
kmeans.fit(points)
plt.scatter(dataset[0][:,0], dataset[0][:,1])
clusters = kmeans.cluster_centers_
# print out the cluster centers
print(clusters)
y_km = kmeans.fit_predict(points)
y_km
plt.scatter(points[y_km== 1,0], points[y_km == 1,1], s=50, color= 'blue')
plt.scatter(points[y_km== 0,0], points[y_km == 0,1], s=50, color= 'red')
plt.scatter(points[y_km== 2,0], points[y_km == 2,1], s=50, color= 'green')
plt.scatter(points[y_km== 3,0], points[y_km == 3,1], s=50, color= 'yellow')
plt.scatter(clusters[0][0], clusters[0][1], marker='*', s=200, color= 'black')
plt.scatter(clusters[1][0], clusters[1][1], marker='*', s=200, color= 'black')
plt.scatter(clusters[2][0], clusters[2][1], marker='*', s=200, color= 'black')
plt.scatter(clusters[3][0], clusters[3][1], marker='*', s=200, color= 'black')
plt.show()
OUTPUT –
Hierarchical Clustering
Aim – To demonstrate hierarchical clustering algorithm
Objective – Plotting of clusters using hierarchical clustering algorithm
Theory –
Hierarchical clustering is another unsupervised machine learning algorithm used to group
unlabeled data into clusters; it is also known as hierarchical cluster analysis or HCA. In this algorithm, we develop
the hierarchy of clusters in the form of a tree, and this tree-shaped structure is known as the dendrogram.
Sometimes the results of K-means clustering and hierarchical clustering may look similar, but the two
differ in how they work: hierarchical clustering does not require the number of clusters to be fixed in advance, as we did in
the K-Means algorithm.
The hierarchical clustering technique has two approaches:
1. Agglomerative: Agglomerative is a bottom-up approach, in which the algorithm starts
by treating every data point as a single cluster and keeps merging the closest clusters until one cluster is left.
2. Divisive: The divisive algorithm is the reverse of the agglomerative algorithm, as it is a top-down
approach.
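The bottom-up merging used by the agglomerative approach can be sketched in plain Python. This is a simplified illustration: it assumes one-dimensional points and single linkage (the distance between two clusters is the distance between their closest members), whereas the program in this experiment uses Ward linkage. The function name agglomerative and the sample points are hypothetical.

```python
def agglomerative(points, n_clusters):
    # Start with every point in its own cluster
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest single-linkage distance
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        # Merge the closest pair into one cluster
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical one-dimensional data
points = [1.0, 1.2, 5.0, 5.1, 9.0]
print(agglomerative(points, 3))
print(agglomerative(points, 1))
```

Recording the merge distance at every step is what produces the dendrogram; stopping the loop early at a chosen number of clusters corresponds to cutting the dendrogram at that level.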
CODE –
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
# Import iris data
iris = datasets.load_iris()
iris_data = pd.DataFrame(iris.data)
iris_data.columns = iris.feature_names
iris_data['flower_type'] = iris.target
iris_data.head()
iris_X = iris_data.iloc[:, [0, 1, 2,3]].values
iris_Y = iris_data.iloc[:,4].values
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 7))
plt.scatter(iris_X[iris_Y == 0, 0], iris_X[iris_Y == 0, 1], s=100,
c='blue', label='Type 1')
plt.scatter(iris_X[iris_Y == 1, 0], iris_X[iris_Y == 1, 1], s=100,
c='yellow', label='Type 2')
plt.scatter(iris_X[iris_Y == 2, 0], iris_X[iris_Y == 2, 1], s=100,
c='green', label='Type 3')
plt.legend()
plt.show()
import scipy.cluster.hierarchy as sc
# Plot dendrogram
plt.figure(figsize=(20, 7))
plt.title("Dendrograms")
# Create dendrogram
sc.dendrogram(sc.linkage(iris_X, method='ward'))
plt.title('Dendrogram')
plt.xlabel('Sample index')
plt.ylabel('Euclidean distance')
from sklearn.cluster import AgglomerativeClustering
# The 'affinity' argument was removed in scikit-learn 1.4; Ward linkage always uses Euclidean distance
cluster = AgglomerativeClustering(n_clusters=3, linkage='ward')
cluster.fit(iris_X)
labels = cluster.labels_
labels
plt.figure(figsize=(10, 7))
plt.scatter(iris_X[labels == 0, 0], iris_X[labels == 0, 1], s = 100, c = 'blue', label = 'Type 1')
plt.scatter(iris_X[labels == 1, 0], iris_X[labels == 1, 1], s = 100, c = 'yellow', label = 'Type 2')
plt.scatter(iris_X[labels == 2, 0], iris_X[labels == 2, 1], s = 100, c = 'green', label = 'Type 3')
plt.legend()
plt.show()
OUTPUT –
AIM: Demonstrate ensemble techniques like boosting, bagging, random forests etc.
RANDOM FOREST
Implement Random Forest Algorithm using Python.
Description
Dataset: Breast Cancer Wisconsin (Diagnostic) Dataset
Let us have a quick look at the dataset:
PROGRAM:
import pandas as pd
df = pd.read_csv("diabetes.csv")
df.head()
df.isnull().sum()
X = df.drop("Outcome",axis="columns")
y = df.Outcome
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
X_scaled[:3]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, stratify=y, random_state=10)
X_train.shape
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
scores = cross_val_score(RandomForestClassifier(n_estimators=50), X, y, cv=5)
scores.mean()
OUTPUT:
BAGGING
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
adaboost = AdaBoostClassifier(n_estimators = 50, learning_rate = 0.2).fit(X_train, Y_train)
score = adaboost.score(X_test, Y_test)
print(score)
OUTPUT-
AIM: Build a classifier, compare its performance with an ensemble technique like
random forest.
RANDOM FOREST
Random Forest is a popular supervised machine learning algorithm.
It can be used for both classification and regression problems. It is based on the
concept of ensemble learning, which is the process of combining multiple models to solve a complex
problem and to improve performance. As the name suggests, a random forest is a
classifier that contains a number of decision trees trained on various subsets of the given dataset.
Instead of relying on one decision tree, the
random forest takes the prediction from each tree and outputs the class with the majority of votes
(for regression, the average of the tree predictions). A greater number of trees generally makes the predictions more stable and
reduces the risk of overfitting, at the cost of more computation.
Random Forest works in two phases: the first creates the random forest by combining N decision
trees, and the second makes a prediction with each tree created in the first phase and combines the results.
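The two ingredients described above, training each tree on a random subset and combining the trees by majority vote, can be sketched in plain Python. The function names bootstrap_sample and majority_vote and the sample values are hypothetical, and the tree predictions are written by hand because training real trees is outside the scope of this sketch.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    # Draw len(rows) rows with replacement; each tree would be trained on one such sample
    return [rng.choice(rows) for _ in rows]

def majority_vote(predictions):
    # The forest's output is the class predicted by most trees
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
rows = list(range(10))                 # stand-in for ten training rows
sample = bootstrap_sample(rows, rng)
print(len(sample), "rows drawn,", len(set(sample)), "distinct")

# Hypothetical predictions from five trees for a single test sample
tree_predictions = [1, 0, 1, 1, 0]
print("Forest prediction:", majority_vote(tree_predictions))
```

Because each bootstrap sample repeats some rows and leaves others out, the individual trees differ from one another, which is what makes the vote more reliable than any single tree.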
Code-
#DECISION TREE
import pandas as pd
df = pd.read_csv("/content/drive/MyDrive/Dataset/diabetes.csv")
df.head()
X = df.drop("Outcome",axis="columns")
y = df.Outcome
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
X_scaled[:3]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, stratify=y,
random_state=10)
X_train.shape
X_test.shape
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
scores = cross_val_score(DecisionTreeClassifier(), X, y, cv=5)
scores
scores.mean()
#RANDOM FOREST
import pandas as pd
df = pd.read_csv("/content/drive/MyDrive/Dataset/diabetes.csv")
df.head()
df.isnull().sum()
X = df.drop("Outcome",axis="columns")
y = df.Outcome
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
X_scaled[:3]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, stratify=y,
random_state=10)
X_train.shape
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
scores = cross_val_score(RandomForestClassifier(n_estimators=50), X, y, cv=5)
scores.mean()
Output –
AIM: Evaluate various classification algorithms performance on a dataset using
various measures like True Positive Rate, False Positive Rate, Precision, Recall etc.
The most commonly used Performance metrics for classification problem are as
follows,
Accuracy
Confusion Matrix
ROC AUC
Log-loss
Accuracy
Accuracy is the simple ratio between the number of correctly classified points to the
total number of points.
Confusion Matrix
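A confusion matrix tabulates the predicted classes against the actual classes. It can be built for a binary
classification problem (2 classes) or a multi-class classification problem (more than 2
classes).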
TP means True Positive. It can be interpreted as the model predicted positive class
and it is True.
FP means False Positive. It can be interpreted as the model predicted positive class
but it is False.
FN means False Negative. It can be interpreted as the model predicted negative class
but it is False.
TN means True Negative. It can be interpreted as the model predicted negative class
and it is True.
Precision is the fraction of the instances predicted as positive that are actually
positive. Recall is the fraction of the actual positive instances that the model
correctly identifies. Precision and recall are given as follows,
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
For example, consider that a search query results in 30 pages, out of which 20 are
relevant, and the results fail to display 40 other relevant pages. So the precision is
20/30 and recall is 20/60.
Precision helps us understand how useful the results are. Recall helps us understand
how complete the results are.
To avoid having to check both numbers separately, the F1 score is used. F1 score is the
harmonic mean of precision and recall. It is given as,
F1 = 2 * Precision * Recall / (Precision + Recall)
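The search-query example can be worked through in a few lines of plain Python. The variable names are illustrative; the counts come from the example itself (30 pages returned, 20 of them relevant, 40 relevant pages missed).

```python
# Counts from the search example
tp = 20          # relevant pages that were returned
fp = 30 - 20     # returned pages that are not relevant
fn = 40          # relevant pages that were not returned

precision = tp / (tp + fp)    # 20 / 30
recall = tp / (tp + fn)       # 20 / 60
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two

print(f"Precision: {precision:.3f}")
print(f"Recall   : {recall:.3f}")
print(f"F1 score : {f1:.3f}")
```

The F1 score lands between the two values but closer to the lower one, which is the property of the harmonic mean that makes it useful when precision and recall disagree.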
Here’s the quick way to compute true/false positives and true/false negatives.
Basically we will,
find the predicted and true labels that are assigned to some specific class
use the “AND” operator to combine the results into a single binary vector
sum over the binary vector to count how many incidences there are
# pred_labels and true_labels are NumPy arrays of 0/1 labels
# True Positive (TP): we predict a label of 1 (positive), and the true label is 1.
TP = np.sum(np.logical_and(pred_labels == 1, true_labels == 1))
# True Negative (TN): we predict a label of 0 (negative), and the true label is 0.
TN = np.sum(np.logical_and(pred_labels == 0, true_labels == 0))
# False Positive (FP): we predict a label of 1 (positive), but the true label is 0.
FP = np.sum(np.logical_and(pred_labels == 1, true_labels == 0))
# False Negative (FN): we predict a label of 0 (negative), but the true label is 1.
FN = np.sum(np.logical_and(pred_labels == 0, true_labels == 1))
print('TP: %i, FP: %i, TN: %i, FN: %i' % (TP, FP, TN, FN))
CODE –
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve
X, label = make_classification(n_samples=500, n_classes=2, weights=[1,1], random_state=100)
X_train, X_test, y_train, y_test = train_test_split(X, label, test_size=0.3, random_state=1)
model = LogisticRegression()
model.fit(X_train, y_train)
probs = model.predict_proba(X_test)
probs = probs[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, probs)
plt.figure(figsize = (10,6))
plt.plot(fpr, tpr, color='red', label='ROC')
plt.plot([0, 1], [0, 1], color='darkblue', linestyle='--')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic Curve')
plt.legend()
plt.show()
from sklearn import datasets
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_curve
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
import matplotlib.pyplot as plt
data = datasets.load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
X_train, X_test, y_train, y_test = train_test_split(df.iloc[:,:-1], df.iloc[:,-1], test_size=0.3,
random_state=42)
model = LogisticRegression()
model.fit(X_train, y_train)
pred = model.predict(X_test)
precision = precision_score(y_test, pred)
recall = recall_score(y_test, pred)
print('Precision: ',precision)
print('Recall: ',recall)
OUTPUT:
AIM: Demonstrate GA for optimization (Minimization or Maximization problem).
Initialize a population N
PROGRAM:
import random

def fitness_function(x):
    # Objective to maximize: f(x) = x^2 - 4x
    return x**2 - 4*x

def generate_individual():
    # An individual is a single real number in [-10, 10]
    return random.uniform(-10, 10)

def generate_population(population_size):
    return [generate_individual() for _ in range(population_size)]

def evaluate_population(population):
    return [fitness_function(individual) for individual in population]

def select_parents(population, fitness_values):
    # Using roulette wheel selection. Fitness can be negative here, so shift
    # the values to be non-negative before using them as selection weights.
    min_fitness = min(fitness_values)
    weights = [fitness - min_fitness + 1e-9 for fitness in fitness_values]
    return random.choices(population, weights=weights, k=2)

def crossover(parent1, parent2):
    # Using simple average crossover
    offspring1 = (parent1 + parent2) / 2.0
    offspring2 = (parent1 + parent2) / 2.0
    return offspring1, offspring2

def mutate(individual, mutation_rate):
    # Perturb individual's solution by adding a small random value
    if random.random() < mutation_rate:
        individual += random.uniform(-1, 1)
    return individual

def replace_population(population, offspring):
    population_size = len(population)
    return population[:population_size - len(offspring)] + offspring

def genetic_algorithm(population_size, num_generations):
    population = generate_population(population_size)
    for _ in range(num_generations):
        fitness_values = evaluate_population(population)
        offspring = []
        while len(offspring) < population_size:
            # Select a fresh pair of parents for every pair of children
            parent1, parent2 = select_parents(population, fitness_values)
            child1, child2 = crossover(parent1, parent2)
            child1 = mutate(child1, mutation_rate=0.1)
            child2 = mutate(child2, mutation_rate=0.1)
            offspring.extend([child1, child2])
        population = replace_population(population, offspring)
    # Find the best individual in the final population
    fitness_values = evaluate_population(population)
    best_individual = population[fitness_values.index(max(fitness_values))]
    return best_individual

# Example usage
population_size = 100
num_generations = 50
best_solution = genetic_algorithm(population_size, num_generations)
best_fitness = fitness_function(best_solution)
print("Best solution:", best_solution)
print("Best fitness:", best_fitness)
Output
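Since the aim mentions minimization as well as maximization, here is a hedged sketch of a minimizing variant for the same function x**2 - 4*x. It swaps in tournament selection, blend crossover, Gaussian mutation, and elitism, none of which are in the program above; the names `objective`, `tournament_select`, and `minimize_with_ga` and all parameter values are assumptions chosen for illustration.

```python
import random


def objective(x):
    # Same function as in the program; its minimum is at x = 2, where f(2) = -4.
    return x**2 - 4*x


def tournament_select(rng, population, k=3):
    # Pick k random individuals and return the one with the lowest objective.
    return min(rng.sample(population, k), key=objective)


def minimize_with_ga(pop_size=50, generations=100, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    population = [rng.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        best = min(population, key=objective)
        next_population = [best]  # elitism: carry the best individual over
        while len(next_population) < pop_size:
            p1 = tournament_select(rng, population)
            p2 = tournament_select(rng, population)
            alpha = rng.random()
            child = alpha * p1 + (1 - alpha) * p2  # blend crossover
            if rng.random() < mutation_rate:
                child += rng.gauss(0, 0.3)  # small Gaussian mutation
            child = max(-10.0, min(10.0, child))  # stay inside the search range
            next_population.append(child)
        population = next_population
    return min(population, key=objective)


best_x = minimize_with_ga()
print("Best x:", best_x)
print("Objective value:", objective(best_x))
```

Because the best individual is always carried over, the best objective value never gets worse from one generation to the next, and the result should land close to x = 2.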
AIM: Case study on supervised/unsupervised learning algorithm
In the real world, we are surrounded by humans who can learn from their
experiences, and we have computers or machines that work on our instructions.
But can a machine also learn from experience or past data like a human does?
This is where Machine Learning comes in.
With the help of sample historical data, known as training data, machine
learning algorithms build a mathematical model that helps in making predictions or
decisions without being explicitly programmed. Machine learning brings computer science
and statistics together to create predictive models. Machine learning constructs or uses
algorithms that learn from historical data. The more information we provide,
the better the performance tends to be.
A machine has the ability to learn if it can improve its performance by gaining more
data.
How does Machine Learning work
A Machine Learning system learns from historical data, builds prediction models,
and whenever it receives new data, predicts the output for it. The accuracy of the
predicted output depends on the amount of data, as a large amount of data helps to
build a better model that predicts the output more accurately.
o It is a data-driven technology.
o Machine learning is very similar to data mining, as it also deals with huge
amounts of data.
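The learn-then-predict loop described above can be sketched with the simplest possible model, a straight line fitted by least squares. The data values and the function names `fit_line` and `predict` are assumptions made purely for illustration.

```python
def fit_line(xs, ys):
    """Learn slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the learned model to predict the output for a new input."""
    slope, intercept = model
    return slope * x + intercept


# "Historical" training data (made up): the output is twice the input.
history_x = [1, 2, 3, 4]
history_y = [2, 4, 6, 8]

model = fit_line(history_x, history_y)   # learning step
print(model)                             # (2.0, 0.0)
print(predict(model, 5))                 # 10.0, prediction for unseen input
```

The model is built once from past data and then reused for every new input, which is the pattern the section describes.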
The need for machine learning is increasing day by day. The reason is that
machine learning is capable of doing tasks that are too complex for a person to
implement directly. As humans, we have limitations: we cannot process huge
amounts of data manually, so we need computer systems, and machine learning
makes this easier for us.
We can train machine learning algorithms by providing them with a large amount of data
and letting them explore the data, construct the models, and predict the required output
automatically. The performance of a machine learning algorithm depends on the amount
of data, and it can be measured by the cost function. With the help of machine learning,
we can save both time and money.
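The text does not say which cost function is meant. As one common choice, the sketch below assumes mean squared error and uses it to compare two sets of predictions; the function name and the numbers are illustrative assumptions.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up target values and the predictions of two hypothetical models.
y_true = [3, 5, 7]
good_predictions = [3, 5, 7]
poor_predictions = [2, 5, 9]

print(mean_squared_error(y_true, good_predictions))  # 0.0
print(mean_squared_error(y_true, poor_predictions))  # (1 + 0 + 4) / 3
```

A lower cost means the predictions are closer to the true values, so the cost can be used to compare models or to track improvement as more data is added.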
The importance of machine learning can be easily understood from its use cases. Currently,
machine learning is used in self-driving cars, cyber fraud detection, face recognition,
friend suggestions by Facebook, etc. Various top companies such as Netflix and
Amazon have built machine learning models that use a vast amount of data to
analyze user interest and recommend products accordingly.
At a broad level, machine learning can be classified into three types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1) Supervised Learning
The system creates a model using labeled data to understand the dataset and learn about
each data point. Once the training and processing are done, we test the model by providing
sample data to check whether it predicts the correct output.
The goal of supervised learning is to map input data to the output data. Supervised
learning is based on supervision, and it is the same as when a student learns things under
the supervision of a teacher. An example of supervised learning is spam filtering.
Supervised learning can be further classified into two categories of algorithms:
o Classification
o Regression
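The train-then-test workflow for classification can be sketched with a one-nearest-neighbor classifier. This is a minimal illustration, not a real spam filter: the two numeric features, the labels, and the function name `nearest_neighbor_predict` are hypothetical.

```python
def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to the query."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    best_index = min(
        range(len(train_points)),
        key=lambda i: squared_distance(train_points[i], query),
    )
    return train_labels[best_index]


# Labeled training data (hypothetical features per message:
# number of links, number of times the word "free" appears).
train_points = [(0, 0), (1, 0), (0, 1), (5, 4), (6, 5), (4, 6)]
train_labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

# Held-out test samples with known answers, used to check the model.
test_points = [(1, 1), (5, 5)]
test_labels = ["ham", "spam"]

predictions = [
    nearest_neighbor_predict(train_points, train_labels, p) for p in test_points
]
correct = sum(1 for p, t in zip(predictions, test_labels) if p == t)
print(predictions)                  # ['ham', 'spam']
print(correct / len(test_labels))   # 1.0
```

The labels in `train_labels` are the supervision: the model never sees the test labels until its predictions are checked against them.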
2) Unsupervised Learning
The machine is trained on a set of data that has not been labeled,
classified, or categorized, and the algorithm needs to act on that data without any
supervision. The goal of unsupervised learning is to restructure the input data into new
features or groups of objects with similar patterns.
In unsupervised learning, we don't have a predetermined result. The machine tries to find
useful insights from a large amount of data. It can be further classified into two
categories of algorithms:
o Clustering
o Association
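Clustering can be illustrated with a tiny k-means loop on one-dimensional data. No labels are given; the algorithm only groups nearby values. The data, the starting centroids, and the function name `kmeans_1d` are assumptions for this sketch.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Group 1-D points around centroids by repeated assign-and-update steps."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabeled data (made up) with two visible groups, and two starting guesses.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 6.0])
print(clusters)    # [[1.0, 1.2, 0.8], [5.0, 5.2, 4.8]]
print(centroids)   # approximately [1.0, 5.0]
```

The result depends on the starting centroids; here they are fixed by hand so the run is repeatable.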
3) Reinforcement Learning
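Reinforcement learning is a feedback-based learning method in which a learning agent
gets a reward for each right action and a penalty for each wrong action. The robotic
dog, which automatically learns the movement of its arms, is an example of
Reinforcement learning.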
A few decades ago (about 40-50 years), machine learning was science fiction, but today it
is part of our daily life. Machine learning is making our day-to-day life easier, from
self-driving cars to Amazon's virtual assistant "Alexa". However, the idea behind machine
learning is old and has a long history. Some milestones in the history of machine
learning are given below:
The early history of Machine Learning (Pre-1940):
o 1834: Charles Babbage, the father of the computer, conceived a device that
could be programmed with punch cards. The machine was never built, but
all modern computers rely on its logical structure.
o 1936: Alan Turing proposed a theory of how a machine can determine and
execute a set of instructions.
The first "AI" winter:
o The period from 1974 to 1980 was a tough time for AI and ML researchers, and
this period was called the AI winter.
o During this period, machine translation failed to deliver, and people lost
interest in AI, which led to reduced government funding for
research.
Machine learning is a buzzword in today's technology, and it is growing very rapidly day
by day. We use machine learning in our daily life even without knowing it, for example in
Google Maps, Google Assistant, Alexa, etc. Below are some of the most trending real-world
applications of Machine Learning:
1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used
to identify objects, persons, places, digital images, etc. A popular use case of image
recognition and face detection is automatic friend tagging suggestion:
It is based on the Facebook project named "DeepFace," which is responsible for face
recognition and person identification in a picture.
2. Speech Recognition
While using Google, we get the option "Search by voice." It comes under speech
recognition, and it is a popular application of machine learning.
Speech recognition is the process of converting voice instructions into text, and it is also
known as "Speech to text" or "Computer speech recognition." At present, machine
learning algorithms are widely used in various speech recognition applications. Google
Assistant, Siri, Cortana, and Alexa use speech recognition technology to follow
voice instructions.
3. Traffic prediction:
If we want to visit a new place, we take the help of Google Maps, which shows us the correct
path with the shortest route and predicts the traffic conditions.
It predicts traffic conditions, such as whether traffic is clear, slow-moving, or heavily
congested, in two ways:
o Real-time location of the vehicle from the Google Maps app and sensors
o Average time taken on past days at the same time of day
Everyone who uses Google Maps is helping to make the app better. It takes
information from the user and sends it back to its database to improve performance.
4. Product recommendations: