
DEPARTMENT OF COMPUTER SCIENCE AND

ARTIFICIAL INTELLIGENCE (CS&AI)

ARTIFICIAL INTELLIGENCE LAB MANUAL

(PC551CSM) B.E V SEM (2023-2024)
ARTIFICIAL INTELLIGENCE LAB
OBJECTIVES:

The course should enable the students:


1. To apply programming skills to formulate solutions for computational problems.
2. To study the implementation of first-order predicate calculus using Prolog.
3. To familiarize with the basic implementation of NLP using the Python library NLTK.
4. To understand the Python library scikit-learn for building machine learning models.
5. To enrich knowledge to select and apply relevant AI tools for the given problem.

COURSE OUTCOMES:

At the end of the course, the student will be able to:


1. Design and develop solutions for informed and uninformed search problems in AI.
2. Demonstrate reasoning in first-order logic using Prolog.
3. Utilize advanced packages like NLTK for implementing natural language processing.
4. Demonstrate and enrich knowledge to select and apply Python libraries to synthesize information
and develop supervised learning models.
5. Develop a case study in multidisciplinary areas to demonstrate the use of AI.

Recommended System/Software Requirements:

 Intel-based desktop PC with a minimum 2.6 GHz or faster processor, with
at least 4 GB RAM and 200 GB free disk space.
 Operating System: any flavor of WINDOWS
 Software: Python IDLE / PyCharm / Jupyter Notebook
INDEX

S.No.  Name of the Experiments

       BASICS

1      Write a program to implement Uninformed search techniques:
       a. BFS
       b. DFS

2      Write a program to implement Informed search techniques:
       a. Greedy Best First Search
       b. A* algorithm

3      Study of Prolog, its facts, and rules:
       a. Write simple facts for the statements and querying it.
       b. Write a program for Family-tree.

4      Program to train and validate the following classifiers for given data (scikit-learn):
       a. Decision Tree
       b. Multi-layer Feed Forward neural network

5      Text processing using NLTK:
       a. Remove stop words
       b. Implement stemming
       c. POS (Parts of Speech) tagging
Program to implement Uninformed search techniques:

Program 1.a
AIM: Breadth-First Search (BFS) is an algorithm used for traversing or searching tree or graph data
structures. It starts at the tree root (or some arbitrary node of a graph) and explores all of the neighbor
nodes at the present depth prior to moving on to nodes at the next depth level.
CODE:
from collections import deque

# Define a graph as an adjacency list
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'F'],
    'D': ['B'],
    'E': ['B', 'F'],
    'F': ['C', 'E']
}

def bfs(graph, start):
    visited = set()          # Set to keep track of visited nodes
    queue = deque([start])   # Initialize the queue with the start node
    while queue:
        vertex = queue.popleft()  # Get the next vertex from the queue
        if vertex not in visited:
            print(vertex)         # Process the current vertex
            visited.add(vertex)
            # Add the unvisited neighbors to the queue
            for neighbor in graph[vertex]:
                if neighbor not in visited:
                    queue.append(neighbor)

# Start BFS from vertex 'A'
bfs(graph, 'A')
Output:

A
B
C
D
E
F

Program 1.b
AIM:

Depth-First Search (DFS) is an algorithm used to traverse a graph or tree data structure. The
name derives from the word "depth": it prioritizes depth and searches along one branch, as
far as it can go, until the end of that branch. In Python, we can easily implement it using
recursion and other data structures like dictionaries and sets.

CODE:
# Define a weighted graph as an adjacency list with weights
graph = {
    'A': [('B', 1), ('C', 2)],
    'B': [('A', 1), ('D', 3), ('E', 4)],
    'C': [('A', 2), ('F', 5)],
    'D': [('B', 3)],
    'E': [('B', 4), ('F', 6)],
    'F': [('C', 5), ('E', 6)]
}

def dfs_weighted(graph, start, visited=None, total_weight=0):
    if visited is None:
        visited = set()
    visited.add(start)
    print(f"Visiting {start}, Total Weight: {total_weight}")
    for neighbor, weight in graph[start]:
        if neighbor not in visited:
            dfs_weighted(graph, neighbor, visited, total_weight + weight)

# Start DFS from vertex 'A' with a total weight of 0
dfs_weighted(graph, 'A', total_weight=0)
Output:
Visiting A, Total Weight: 0
Visiting B, Total Weight: 1
Visiting D, Total Weight: 4
Visiting E, Total Weight: 5
Visiting F, Total Weight: 11
Visiting C, Total Weight: 16

Informed Search Strategies

a. Greedy Best First Search

AIM: Greedy Best-First Search is an AI search algorithm that attempts to find the most promising path from
a given starting point to a goal. The algorithm works by evaluating each candidate node with a heuristic
estimate of its distance to the goal and then expanding the node with the lowest estimate.

ALGORITHM:
Let OPEN be a priority queue containing the initial state
LOOP
    If OPEN is empty, return failure
    Node <- Remove-First(OPEN)
    If Node is a Goal
        Then return the path from the initial state to the Goal
    Else generate all successors of Node and
        put the newly generated nodes into OPEN according to their heuristic values h(n)
END LOOP

CODE:
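The CODE section is blank in the original manual. Below is a minimal sketch of Greedy Best-First Search consistent with the ALGORITHM above; the graph and heuristic table are illustrative assumptions (not given in the source), chosen so the search reproduces the expected output.

import heapq

# Illustrative graph and heuristic values (assumed; not part of the original manual)
graph = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['E'],
    'D': ['G'],
    'E': ['G'],
    'G': []
}
heuristic = {'A': 10, 'B': 8, 'C': 5, 'D': 7, 'E': 3, 'G': 0}

def greedy_best_first_search(graph, start, goal, heuristic):
    # OPEN is a priority queue ordered by the heuristic value h(n)
    open_list = [(heuristic[start], start, [start])]
    visited = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)  # node with the smallest h(n)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # Generate all successors and put them into OPEN keyed by h(n)
        for neighbor in graph[node]:
            if neighbor not in visited:
                heapq.heappush(open_list,
                               (heuristic[neighbor], neighbor, path + [neighbor]))
    return None  # failure

path = greedy_best_first_search(graph, 'A', 'G', heuristic)
print(f"Path from A to G is: {' -> '.join(path)}")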

EXPECTED OUTPUT:
Path from A to G is: A -> C -> E -> G

b. A* algorithm

It is a searching algorithm that is used to find the shortest path between an initial and a final
point. It is a handy algorithm that is often used for map traversal to find the shortest path to
be taken. A* was initially designed as a graph traversal problem, to help build a robot that
can find its own course.

CODE:

import heapq

def astar_search(graph, start, goal, heuristic):
    # Entries on the open list are (f_score, g_score, node)
    open_list = [(0 + heuristic[start], 0, start)]
    came_from = {}
    g_score = {node: float('inf') for node in graph}
    g_score[start] = 0
    while open_list:
        _, current_cost, current_node = heapq.heappop(open_list)
        if current_node == goal:
            path = reconstruct_path(came_from, current_node)
            return path, current_cost
        for neighbor, cost in graph[current_node].items():
            tentative_g_score = g_score[current_node] + cost
            if tentative_g_score < g_score[neighbor]:
                g_score[neighbor] = tentative_g_score
                f_score = g_score[neighbor] + heuristic[neighbor]
                heapq.heappush(open_list, (f_score, tentative_g_score, neighbor))
                came_from[neighbor] = current_node
    return None, float('inf')  # Return None and infinity if no path is found

def reconstruct_path(came_from, current_node):
    path = [current_node]
    while current_node in came_from:
        current_node = came_from[current_node]
        path.insert(0, current_node)
    return path

# Example usage:
# Define your graph as an adjacency list
graph = {
    'A': {'B': 5, 'C': 3},
    'B': {'D': 8, 'E': 6},
    'C': {'E': 2, 'F': 4},
    'D': {'G': 9},
    'E': {'G': 7},
    'F': {},
    'G': {}
}

# Define your heuristic values for each node
heuristic = {
    'A': 10,
    'B': 8,
    'C': 7,
    'D': 6,
    'E': 4,
    'F': 3,
    'G': 0
}

start_node = 'A'
goal_node = 'G'
path, total_cost = astar_search(graph, start_node, goal_node, heuristic)
if path:
    print(f'Path from {start_node} to {goal_node}: {" -> ".join(path)}')
    print(f'Lowest total cost: {total_cost}')
else:
    print(f'No path from {start_node} to {goal_node} found.')
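
The original manual does not show the output; tracing the algorithm on the graph and heuristic above gives:

EXPECTED OUTPUT:
Path from A to G: A -> C -> E -> G
Lowest total cost: 12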

Introduction to Prolog:
Prolog (Programming in Logic) is a programming language particularly well-suited for
symbolic reasoning and manipulation. It is based on formal logic and is widely used in
artificial intelligence and natural language processing. In Prolog, you define relations and
rules, and the system can then query these relations to derive new information.

Basics of Prolog
FACTS: In Prolog, you declare facts about relationships and properties. Facts are statements
that are true.
Example: raining.

QUERIES: Prolog can be queried to find relationships based on the defined facts and rules.
Example: ?- raining.
?- is the Prolog prompt. To this query, Prolog will answer yes: raining is true because (from
above) Prolog matches it in its database of facts.

Facts with arguments have the general form: relation(<argument1>, <argument2>, ..., <argumentN>).
Relation names must begin with a lowercase letter, e.g.:
likes(john,mary).

Rules: You can define rules that express relationships based on facts, e.g.:

likes(X, Y) :- loves(X, Y).
likes(X, Y) :- loves(Y, X).

Note: Facts have some simple rules of syntax. Facts should always begin with a lowercase
letter and end with a full stop. The facts themselves can consist of any letter or number
combination, as well as the underscore _ character. However, names containing the
characters -,+,*,/, or other mathematical operators should be avoided.

Facts with Arguments Examples

An example database. It details who eats what in some world model.


eats(fred,oranges). /* "Fred eats oranges" */

eats(fred,t_bone_steaks). /* "Fred eats T-bone steaks" */

eats(tony,apples). /* "Tony eats apples" */

eats(john,apples). /* "John eats apples" */

eats(john,grapefruit). /* "John eats grapefruit" */

If we now ask some queries we would get the following interaction:

?- eats(fred,oranges). /* does this match anything in the database? */

yes /* yes, matches the first clause in the database */

?- eats(john,apples). /* do we have a fact that says john eats apples? */

yes /* yes we do, clause 4 of our eats database */

?- eats(mike,apples). /* how about this query, does mike eat apples */

no /* not according to the above database. */

?- eats(fred,apples). /* does fred eat apples */

no /* again no, we don't know whether fred eats apples */

Facts and queries

1) Facts

sing_a_song(ananya).
listens_to_music(rohit).
listens_to_music(ananya) :- sing_a_song(ananya).
happy(ananya) :- sing_a_song(ananya).
happy(rohit) :- listens_to_music(rohit).
plays_guitar(rohit) :- listens_to_music(rohit).

Queries

| ?- happy(rohit).
| ?- sing_a_song(rohit).
| ?- sing_a_song(ananya).
| ?- plays_guitar(rohit).
| ?- plays_guitar(ananya).
| ?- listens_to_music(ananya).

2) Facts

woman(mia).
woman(jody).
woman(yolanda).
playsAirGuitar(jody).

Queries

?- woman(mia).
?- playsAirGuitar(jody).
?- playsAirGuitar(mia).
?- playsAirGuitar(vincent).

b. Write a Program For Family Tree.

AIM: A simple example of a family tree program in Prolog.

DESCRIPTION: In Prolog, you can represent a family tree using predicates to define

relationships.

In this example:

 male/1 and female/1 are used to define the gender of individuals.
 parent/2 states the parent-child relationships.
 father/2 and mother/2 are rules to define father and mother relationships.
 sibling/2 is a rule to define sibling relationships.
 The queries demonstrate how you can use these predicates and rules to ask questions about
the family relationships.

CODE:

% Facts

male(john).

male(bob).

male(jim).

female(jane).

female(susan).

female(emily).

parent(john, bob).

parent(john, jim).

parent(jane, bob).

parent(jane, jim).

parent(bob, jack).

parent(susan, jack).

parent(jim, emily).

% Rules

father(X, Y) :- male(X), parent(X, Y).

mother(X, Y) :- female(X), parent(X, Y).

sibling(X, Y) :- parent(Z, X), parent(Z, Y), X \= Y.

grandparent(X, Z) :- parent(X, Y), parent(Y, Z).

% Queries

?- father(john, bob).

% true

?- mother(jane, jim).

% true

?- sibling(bob, jim).

% true

?- grandparent(john, jack).

% true

?- grandparent(susan, emily).

% false

Experiment 4
AIM: Write a program to train and validate the following classifiers on a given dataset (scikit-learn).
a) Decision Tree

The aim of a decision tree classifier is to make predictions or decisions based on a set of input
features. Decision trees are a popular machine learning algorithm used for both classification and
regression tasks.
DESCRIPTION:

The program consists of the following steps:

Import the libraries
Load the iris dataset from scikit-learn
Split the dataset into train and test sets
Create a DT classifier
Train the classifier
Make the predictions
Evaluate accuracy
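
CODE:

The code pages are blank in the original manual; below is a minimal sketch that follows the steps listed above, assuming scikit-learn's bundled iris dataset (the split ratio and random_state are illustrative choices):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load the iris dataset from scikit-learn
iris = load_iris()
X, y = iris.data, iris.target

# Split the dataset into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create and train the DT classifier
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Make predictions and evaluate accuracy
y_pred = clf.predict(X_test)
print("Decision Tree accuracy:", accuracy_score(y_test, y_pred))

For part (b) of this experiment (Multi-layer Feed Forward neural network, listed in the index but not shown in the manual), an analogous sketch uses scikit-learn's MLPClassifier; the hidden-layer size and iteration count are illustrative assumptions:

from sklearn.neural_network import MLPClassifier

# Create and train a multi-layer feed-forward network on the same split (assumed settings)
mlp = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=42)
mlp.fit(X_train, y_train)
print("MLP accuracy:", accuracy_score(y_test, mlp.predict(X_test)))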

EXPECTED OUTPUT:

Text processing using NLTK

a. Remove stop words: In natural language processing (NLP), stop words refer to common
words that are often removed from text data during the preprocessing stage. These words are
considered to be of little value in terms of information retrieval or text analysis because they
are frequently used in a language and don't typically contribute much to the meaning of a
sentence. Examples of stop words in English include "the," "and," "is," "in," and so on.
Removing stop words in NLP involves filtering out these common words from a given
text to focus on the more meaningful words.

CODE:
# Removing stopwords like "is", "an", "a"
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Download necessary resources
nltk.download('stopwords')
nltk.download('punkt')

def remove_stop_words(text):
    # Tokenize the text
    words = word_tokenize(text)
    # Get the list of English stop words
    stop_words = set(stopwords.words('english'))
    # Remove stop words
    filtered_words = [word for word in words if word.lower() not in stop_words]
    # Join the filtered words back into a sentence
    filtered_text = ' '.join(filtered_words)
    return filtered_text

# Example usage
input_text = "This is an example sentence with some stop words."
output_text = remove_stop_words(input_text)
print("Original text:", input_text)
print("Text without stop words:", output_text)

EXPECTED OUTPUT:

Original text: This is an example sentence with some stop words.
Text without stop words: example sentence stop words .

b. Implement stemming
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer
nltk.download('punkt')
# Example text
text = "programming programmed."
# Tokenize the text into words
words = word_tokenize(text)
# Create a Porter stemmer
stemmer = PorterStemmer()
# Apply stemming to each word
stemmed_words = [stemmer.stem(word) for word in words]
# Print the results
print("Original Words:")
print(words)
print("\nStemmed Words:")
print(stemmed_words)

EXPECTED OUTPUT:

Original Words:
['programming', 'programmed', '.']

Stemmed Words:
['program', 'program', '.']
c. POS (Parts of Speech) tagging
import nltk
# Download the required resources
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
# Now, you should be able to use the pos_tag function without errors
from nltk import pos_tag
from nltk.tokenize import word_tokenize
# Example text
text = "the cat sat on the mat."
# Tokenize the text into words
words = word_tokenize(text)
# Part-of-speech tagging
tagged_words = pos_tag(words)
print("Part-of-Speech Tags:", tagged_words)
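
A typical run (using Penn Treebank tag abbreviations) prints output along these lines:

EXPECTED OUTPUT:
Part-of-Speech Tags: [('the', 'DT'), ('cat', 'NN'), ('sat', 'VBD'), ('on', 'IN'), ('the', 'DT'), ('mat', 'NN'), ('.', '.')]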

AIM: Programs for classification

a. Build models using linear regression and logistic regression and apply them to classify a new instance.

b. Write a program to demonstrate the following classifiers. Use an appropriate dataset for building the
model. Apply the model to classify a new instance.

i. Decision Tree
ii. K-Nearest Neighbour
iii. Naïve Bayes
iv. Support Vector Machine

Build models using linear regression and logistic regression and apply them to classify a new instance.

Linear Regression

The term "linearity" in algebra refers to a linear relationship between two or more variables. If we
draw this relationship in a two-dimensional space (between two variables, in this case), we get a
straight line.

Let's consider a scenario where we want to determine the linear relationship between the number of
hours a student studies and the percentage of marks that the student scores in an exam. We want to
find out, given the number of hours a student prepares for a test, about how high a score the student
can achieve. If we plot the independent variable (hours) on the x-axis and the dependent variable
(percentage) on the y-axis, linear regression gives us the straight line that best fits the data points.

We know that the equation of a straight line is basically:

y = mx + b

Least squares estimates these coefficients from the data as b_1 = SS_xy / SS_xx and
b_0 = mean(y) - b_1 * mean(x), which is exactly what the program below computes.

Program

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)
    # predicted response vector
    y_pred = b[0] + b[1]*x
    # plotting the regression line
    plt.plot(x, y_pred, color="g")
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    # function to show plot
    plt.show()

def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

Output
Estimated coefficients:
b_0 = -0.0586206896552
b_1 = 1.45747126437

Logistic Regression

The dataset contains information on users from a company's database. It contains
information about UserID, Gender, Age, Estimated Salary, and Purchased. Use this
dataset to predict whether a user will purchase the company's newly launched
product or not with a Logistic Regression model.

PROGRAM

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

dataset = pd.read_csv("User_Data.csv")

# input
x = dataset.iloc[:, [2, 3]].values

# output
y = dataset.iloc[:, 4].values

from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.25, random_state=0)

from sklearn.preprocessing import StandardScaler
sc_x = StandardScaler()
xtrain = sc_x.fit_transform(xtrain)
xtest = sc_x.transform(xtest)

print(xtrain[0:10, :])

Output:
[[ 0.58164944 -0.88670699]
[-0.60673761 1.46173768]
[-0.01254409 -0.5677824 ]
[-0.60673761 1.89663484]
[ 1.37390747 -1.40858358]
[ 1.47293972 0.99784738]
[ 0.08648817 -0.79972756]
[-0.01254409 -0.24885782]
[-0.21060859 -0.5677824 ]
[-0.21060859 -0.19087153]]

from sklearn.linear_model import LogisticRegression

classifier = LogisticRegression(random_state = 0)
classifier.fit(xtrain, ytrain)

y_pred = classifier.predict(xtest)
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(ytest, y_pred)
print ("Confusion Matrix : \n", cm)

Output:
Confusion Matrix :
[[65 3]
[ 8 24]]

from sklearn.metrics import accuracy_score

print ("Accuracy : ", accuracy_score(ytest, y_pred))

Output:
Accuracy : 0.89
from matplotlib.colors import ListedColormap

X_set, y_set = xtest, ytest
X1, X2 = np.meshgrid(np.arange(start=X_set[:, 0].min() - 1,
                               stop=X_set[:, 0].max() + 1, step=0.01),
                     np.arange(start=X_set[:, 1].min() - 1,
                               stop=X_set[:, 1].max() + 1, step=0.01))

plt.contourf(X1, X2, classifier.predict(
    np.array([X1.ravel(), X2.ravel()]).T).reshape(
    X1.shape), alpha=0.75, cmap=ListedColormap(('red', 'green')))

plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())

for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c=ListedColormap(('red', 'green'))(i), label=j)

plt.title('Classifier (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()

OUTPUT:

Decision Tree

 The Decision Tree basically is an inverted tree, with each node representing features and attributes.
 The leaf nodes represent the output.
 Except for the leaf nodes, the remaining nodes act as decision-making nodes.

Algorithms

 CART (Gini Index)
 ID3 (Entropy, Information Gain)

Algorithm Concepts

1. To understand this concept, we take an example, assuming we have a data set.
2. Based on this data, we have to find out if we can play someday or not.
3. We have four attributes in the data set. Now how do we decide which attribute we
should put on the root node?
4. For this, we will calculate the information gain of all the attributes (features); the
attribute with maximum information gain will be our root node.

Step 1: Creating a root node

Entropy (entropy of the whole data set):

Entropy(S) = -(p/(p+n)) * log2(p/(p+n)) - (n/(p+n)) * log2(n/(p+n))

p - number of positive examples
n - number of negative examples

Step 2: For every attribute/feature

Average information of a particular attribute:

I(Attribute) = Sum over attribute values of { ((pi+ni)/(p+n)) * Entropy(attribute value) }

pi - number of positive examples for a particular attribute value
ni - number of negative examples for a particular attribute value
Entropy(attribute value) - entropy of that attribute value, calculated the same way as for the
whole data set

Information Gain (a small numeric illustration appears after Step 5):

Gain = Entropy(S) - I(Attribute)

1. If all examples are positive, return the single-node tree with label = +.
2. If all examples are negative, return the single-node tree with label = -.
3. If the attribute set is empty, return the single-node tree.

Step 4: Pick the highest-gain attribute

1. The attribute that has the most information gain becomes a node; we group the data by
its values and process each group the same way as we did for the parent (root) node.
2. Again, the feature with maximum information gain becomes the next node, and this
process continues until we get a leaf node.

Step 5: Repeat until we get the final node (leaf node)
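
As a quick numeric illustration of the entropy and gain formulas above (the counts are the classic play-tennis numbers, used here only as an assumed example):

import math

def entropy(p, n):
    # Entropy of a set with p positive and n negative examples
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:  # treat 0*log2(0) as 0
            frac = count / total
            result -= frac * math.log2(frac)
    return result

# Whole data set: 9 positive and 5 negative examples (assumed counts)
S = entropy(9, 5)  # ~0.940

# Average information of an attribute with two values,
# e.g. value1 -> (6+, 2-), value2 -> (3+, 3-)
I_attr = (8/14) * entropy(6, 2) + (6/14) * entropy(3, 3)

print("Entropy(S) =", round(S, 3))      # 0.94
print("Gain =", round(S - I_attr, 3))   # ~0.048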

PROGRAM:

# Importing the required packages
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report

# Function importing Dataset
def importdata():
    balance_data = pd.read_csv(
        'https://archive.ics.uci.edu/ml/machine-learning-' +
        'databases/balance-scale/balance-scale.data',
        sep=',', header=None)

    # Printing the dataset shape
    print("Dataset Length: ", len(balance_data))
    print("Dataset Shape: ", balance_data.shape)

    # Printing the dataset observations
    print("Dataset: ", balance_data.head())
    return balance_data

# Function to split the dataset
def splitdataset(balance_data):
    # Separating the target variable
    X = balance_data.values[:, 1:5]
    Y = balance_data.values[:, 0]

    # Splitting the dataset into train and test
    X_train, X_test, y_train, y_test = train_test_split(
        X, Y, test_size=0.3, random_state=100)

    return X, Y, X_train, X_test, y_train, y_test

# Function to perform training with giniIndex.
def train_using_gini(X_train, X_test, y_train):
    # Creating the classifier object
    clf_gini = DecisionTreeClassifier(criterion="gini", random_state=100,
                                      max_depth=3, min_samples_leaf=5)
    # Performing training
    clf_gini.fit(X_train, y_train)
    return clf_gini

# Function to perform training with entropy.
def train_using_entropy(X_train, X_test, y_train):
    # Decision tree with entropy
    clf_entropy = DecisionTreeClassifier(criterion="entropy", random_state=100,
                                         max_depth=3, min_samples_leaf=5)
    # Performing training
    clf_entropy.fit(X_train, y_train)
    return clf_entropy

# Function to make predictions
def prediction(X_test, clf_object):
    # Prediction on the test set
    y_pred = clf_object.predict(X_test)
    print("Predicted values:")
    print(y_pred)
    return y_pred

# Function to calculate accuracy
def cal_accuracy(y_test, y_pred):
    print("Confusion Matrix: ", confusion_matrix(y_test, y_pred))
    print("Accuracy : ", accuracy_score(y_test, y_pred) * 100)
    print("Report : ", classification_report(y_test, y_pred))

# Driver code
def main():
    # Building Phase
    data = importdata()
    X, Y, X_train, X_test, y_train, y_test = splitdataset(data)
    clf_gini = train_using_gini(X_train, X_test, y_train)
    clf_entropy = train_using_entropy(X_train, X_test, y_train)

    # Operational Phase
    print("Results Using Gini Index:")
    # Prediction using gini
    y_pred_gini = prediction(X_test, clf_gini)
    cal_accuracy(y_test, y_pred_gini)

    print("Results Using Entropy:")
    # Prediction using entropy
    y_pred_entropy = prediction(X_test, clf_entropy)
    cal_accuracy(y_test, y_pred_entropy)

# Calling main function
if __name__ == "__main__":
    main()

Output:

Dataset Length: 625
Dataset Shape: (625, 5)
Dataset:
   0  1  2  3  4
0  B  1  1  1  1
1  R  1  1  1  2
2  R  1  1  1  3
3  R  1  1  1  4
4  R  1  1  1  5

Results Using Gini Index:
Predicted values:
['R' 'L' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'R' 'L' 'L' 'L' 'R' 'L' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'L' 'R' 'L' 'L' 'L' 'R' 'L' 'L' 'L' 'R' 'L' 'L' 'L' 'L' 'R' 'L'
'L' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'L' 'L' 'R' 'R' 'L' 'L' 'L' 'L' 'L' 'R' 'R' 'L' 'L' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'L'
'L' 'L' 'R' 'R' 'L' 'L' 'L' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'L' 'L' 'L' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'L'
'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'L' 'L' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'R' 'R' 'L' 'R' 'R' 'L' 'L' 'R' 'R' 'R']

Confusion Matrix:
[[ 0  6  7]
 [ 0 67 18]
 [ 0 19 71]]
Accuracy : 73.40425531914893
Report :
              precision    recall  f1-score   support
           B       0.00      0.00      0.00        13
           L       0.73      0.79      0.76        85
           R       0.74      0.79      0.76        90
    accuracy                           0.73       188
   macro avg       0.49      0.53      0.51       188
weighted avg       0.68      0.73      0.71       188

Results Using Entropy:
Predicted values:
['R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'L' 'R' 'L'
'R' 'L' 'R' 'L' 'R' 'R' 'L' 'L' 'R' 'L' 'L' 'R' 'L' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'L' 'R' 'L' 'L' 'L' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'L'
'L' 'R' 'R' 'L' 'L' 'L' 'R' 'L' 'L' 'R' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L' 'L' 'L' 'L' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'R' 'L'
'L' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'L' 'R' 'R' 'R' 'L' 'L' 'L' 'R' 'R' 'R']

Confusion Matrix:
[[ 0  6  7]
 [ 0 63 22]
 [ 0 20 70]]
Accuracy : 70.74468085106383
Report :
              precision    recall  f1-score   support
           B       0.00      0.00      0.00        13
           L       0.71      0.74      0.72        85
           R       0.71      0.78      0.74        90
    accuracy                           0.71       188
   macro avg       0.47      0.51      0.49       188
weighted avg       0.66      0.71      0.68       188

NAIVE BAYES
Write a program to implement the naive Bayesian classifier for a sample training
data set stored as a .CSV file. Compute the accuracy of the classifier,
considering a few test data sets.

Description

Naive Bayes is a basic but effective probabilistic classification model in machine
learning that draws on Bayes' Theorem.

Bayes' theorem is a formula that gives the conditional probability of an event A
happening given that another event B has already happened.

Its mathematical formula is as follows:

P(A|B) = P(B|A) * P(A) / P(B)

Where

A and B are two events
P(A|B) is the probability of event A provided event B has already happened.
P(B|A) is the probability of event B provided event A has already happened.
P(A) is the independent probability of A
P(B) is the independent probability of B

For example, if P(B|A) = 0.9, P(A) = 0.1 and P(B) = 0.2, then P(A|B) = 0.9 * 0.1 / 0.2 = 0.45.

Now, this Bayes theorem can be used to generate the following classification model:

P(y|X) = P(X|y) * P(y) / P(X)

Where

X = (x1, x2, x3, ..., xN) is a list of independent predictors
y is the class label
P(y|X) is the probability of label y given the predictors X

The above equation may be extended as follows:

P(y|x1, ..., xN) = P(x1|y) * P(x2|y) * ... * P(xN|y) * P(y) / (P(x1) * P(x2) * ... * P(xN))

Characteristics of Naive Bayes Classifier

The Naive Bayes method makes the assumption that the predictors contribute equally
and independently to selecting the output class.

Although the Naive Bayes model’s assumption that all predictors are independent of
one another is unfeasible in real-world circumstances, this assumption produces a
satisfactory outcome in the majority of instances.

Naive Bayes is often used for text categorization since the dimensionality of the data
is frequently rather large.

Types of Naive Bayes Classifiers

Naive Bayes Classifiers are classified into three categories —

i) Gaussian Naive Bayes

This classifier is employed when the predictor values are continuous and are
expected to follow a Gaussian distribution.

ii) Bernoulli Naive Bayes

When the predictors are boolean in nature and are supposed to follow the Bernoulli
distribution, this classifier is utilized.

iii) Multinomial Naive Bayes

This classifier makes use of a multinomial distribution and is often used to solve
issues involving document or text classification.
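
For comparison with the from-scratch program below, scikit-learn provides these variants directly. A minimal Gaussian example follows; the tiny arrays are illustrative assumptions, not data from this manual:

from sklearn.naive_bayes import GaussianNB  # BernoulliNB and MultinomialNB also exist

# Two classes with continuous predictors (made-up values for illustration)
X = [[1.0, 2.1], [1.2, 1.9], [3.8, 4.0], [4.1, 3.9]]
y = [0, 0, 1, 1]

model = GaussianNB()
model.fit(X, y)
print(model.predict([[1.1, 2.0]]))  # expected: [0]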

PROGRAM:

# Importing library
import math
import random
import csv

# The categorical class names are changed to numeric data
# eg: yes and no encoded to 1 and 0
def encode_class(mydata):
    classes = []
    for i in range(len(mydata)):
        if mydata[i][-1] not in classes:
            classes.append(mydata[i][-1])
    for i in range(len(classes)):
        for j in range(len(mydata)):
            if mydata[j][-1] == classes[i]:
                mydata[j][-1] = i
    return mydata

# Splitting the data
def splitting(mydata, ratio):
    train_num = int(len(mydata) * ratio)
    train = []
    # initially the test set will have all the dataset
    test = list(mydata)
    while len(train) < train_num:
        # index generated randomly from range 0 to length of testset
        index = random.randrange(len(test))
        # from testset, pop data rows and put them in train
        train.append(test.pop(index))
    return train, test

# Group the data rows under each class yes or
# no in a dictionary, eg: dict[yes] and dict[no]
def groupUnderClass(mydata):
    dict = {}
    for i in range(len(mydata)):
        if mydata[i][-1] not in dict:
            dict[mydata[i][-1]] = []
        dict[mydata[i][-1]].append(mydata[i])
    return dict

# Calculating Mean
def mean(numbers):
    return sum(numbers) / float(len(numbers))

# Calculating Standard Deviation
def std_dev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x - avg, 2) for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)

def MeanAndStdDev(mydata):
    info = [(mean(attribute), std_dev(attribute)) for attribute in zip(*mydata)]
    # eg: list = [[a, b, c], [m, n, o], [x, y, z]]
    # here mean of 1st attribute = (a + m + x)/3, mean of 2nd attribute = (b + n + y)/3
    # delete summaries of the last column (the class label)
    del info[-1]
    return info

# Find Mean and Standard Deviation under each class
def MeanAndStdDevForClass(mydata):
    info = {}
    dict = groupUnderClass(mydata)
    for classValue, instances in dict.items():
        info[classValue] = MeanAndStdDev(instances)
    return info

# Calculate Gaussian Probability Density Function
def calculateGaussianProbability(x, mean, stdev):
    expo = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * expo

# Calculate Class Probabilities
def calculateClassProbabilities(info, test):
    probabilities = {}
    for classValue, classSummaries in info.items():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            mean, std_dev = classSummaries[i]
            x = test[i]
            probabilities[classValue] *= calculateGaussianProbability(x, mean, std_dev)
    return probabilities

# Make prediction - the class with the highest probability is the prediction
def predict(info, test):
    probabilities = calculateClassProbabilities(info, test)
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

# Returns predictions for a set of examples
def getPredictions(info, test):
    predictions = []
    for i in range(len(test)):
        result = predict(info, test[i])
        predictions.append(result)
    return predictions

# Accuracy score
def accuracy_rate(test, predictions):
    correct = 0
    for i in range(len(test)):
        if test[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(test))) * 100.0

# Driver code

# add the data path in your system
filename = r'/content/drive/MyDrive/Dataset/pima-indians-diabetes.csv'

# load the file and store it in the mydata list
mydata = csv.reader(open(filename, "rt"))
mydata = list(mydata)
mydata = encode_class(mydata)
for i in range(len(mydata)):
    mydata[i] = [float(x) for x in mydata[i]]

# split ratio = 0.7
# 70% of the data is training data and 30% is test data
ratio = 0.7
train_data, test_data = splitting(mydata, ratio)
print('Total number of examples are: ', len(mydata))
print('Out of these, training examples are: ', len(train_data))
print("Test examples are: ", len(test_data))

# prepare model
info = MeanAndStdDevForClass(train_data)

# test model
predictions = getPredictions(info, test_data)
accuracy = accuracy_rate(test_data, predictions)
print("Accuracy of your model is: ", accuracy)

OUTPUT:

Total number of examples are: 768
Out of these, training examples are: 537
Test examples are: 231
Accuracy of your model is: 74.02597402597402

K-Nearest Neighbour algorithm:

This algorithm is used to solve classification problems. The K-nearest
neighbour (K-NN) algorithm essentially creates an imaginary boundary to classify the
data. When new data points come in, the algorithm predicts their class from the
nearest side of that boundary line.
A larger k value means smoother curves of separation, resulting in less
complex models, whereas a smaller k value tends to overfit the data, resulting in
complex models.
Note: It's very important to have the right k-value when analysing the dataset to
avoid overfitting and underfitting.
Using the k-nearest neighbour algorithm we fit the historical data (or train the
model) and predict the future.

PROGRAM:
# Import necessary modules
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# Loading data
irisData = load_iris()

# Create feature and target arrays
X = irisData.data
y = irisData.target

# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

knn = KNeighborsClassifier(n_neighbors=7)
knn.fit(X_train, y_train)

# Predict on dataset which model has not seen before
print(knn.predict(X_test))

In the example shown above following steps are performed:


1. The k-nearest neighbour algorithm is imported from the scikit-learn package.
2. Create feature and target variables.
3. Split data into training and test data.
4. Generate a k-NN model using neighbors value.
5. Train or fit the data into the model.
6. Predict the future.
We have seen how we can use the K-NN algorithm to solve a supervised machine
learning problem. But how do we measure the accuracy of the model?
Consider the example shown below, where we measure the performance of the
above model:

# Import necessary modules
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

# Loading data
irisData = load_iris()

# Create feature and target arrays
X = irisData.data
y = irisData.target

# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

knn = KNeighborsClassifier(n_neighbors=7)
knn.fit(X_train, y_train)

# Calculate the accuracy of the model
print(knn.score(X_test, y_test))
Model Accuracy:

So far so good. But how to decide the right k-value for the dataset? Obviously, we
need to be familiar with the data to get the range of expected k-values, but to get the
exact k-value we need to test the model for each expected k-value. Refer to the
example shown below.

# Import necessary modules
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import numpy as np
import matplotlib.pyplot as plt

irisData = load_iris()

# Create feature and target arrays
X = irisData.data
y = irisData.target

# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

neighbors = np.arange(1, 9)
train_accuracy = np.empty(len(neighbors))
test_accuracy = np.empty(len(neighbors))

# Loop over K values
for i, k in enumerate(neighbors):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    # Compute training and test data accuracy
    train_accuracy[i] = knn.score(X_train, y_train)
    test_accuracy[i] = knn.score(X_test, y_test)

# Generate plot
plt.plot(neighbors, test_accuracy, label='Testing dataset Accuracy')
plt.plot(neighbors, train_accuracy, label='Training dataset Accuracy')
plt.legend()
plt.xlabel('n_neighbors')
plt.ylabel('Accuracy')
plt.show()

Output:

Here in the example shown above, we are creating a plot to see the k-value for
which we have high accuracy.
Note: This is not the technique used industry-wide to choose the correct value of
n_neighbors. Instead, we do hyperparameter tuning to choose the value that
gives the best performance.

Support Vector Machine (SVM)


Importing the dataset

import pandas as pd
data = pd.read_csv("apples_and_oranges.csv")

Splitting the dataset into training and test samples

from sklearn.model_selection import train_test_split


training_set, test_set = train_test_split(data, test_size = 0.2, random_state = 1)

Classifying the predictors and target

X_train = training_set.iloc[:,0:2].values
Y_train = training_set.iloc[:,2].values
X_test = test_set.iloc[:,0:2].values
Y_test = test_set.iloc[:,2].values

Initializing Support Vector Machine and fitting the training data

from sklearn.svm import SVC


classifier = SVC(kernel='rbf', random_state = 1)
classifier.fit(X_train,Y_train)

Predicting the classes for test set

Y_pred = classifier.predict(X_test)

Attaching the predictions to test set for comparing

test_set["Predictions"] = Y_pred

Comparing the actual classes and predictions

Let’s have a look at the test_set:

Calculating the accuracy of the predictions

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(Y_test, Y_pred)
accuracy = float(cm.diagonal().sum()) / len(Y_test)
print("\nAccuracy Of SVM For The Given Dataset : ", accuracy)

Output:

Accuracy Of SVM For The Given Dataset : 0.875

AIM: Demonstration of clustering algorithms using
a. K-Means
b. Hierarchical algorithms (agglomerative etc.). Interpret the clusters obtained.

K-Means Clustering
Aim – To demonstrate the k-means clustering algorithm
Objective – Plotting of clusters using the k-means clustering algorithm
Theory –
K-Means Clustering is an unsupervised learning algorithm that groups an unlabeled dataset into
different clusters. Here K defines the number of pre-defined clusters that need to be created in the
process: if K=2, there will be two clusters, for K=3 there will be three clusters, and so on.
It allows us to cluster the data into different groups and is a convenient way to discover the
categories of groups in an unlabeled dataset on its own, without the need for any training. It is a
centroid-based algorithm, where each cluster is associated with a centroid. The main aim of this
algorithm is to minimize the sum of distances between the data points and their corresponding
clusters.

Code –
# pip install -U scikit-learn  (run once if scikit-learn is not installed)

# import required libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

dataset = make_blobs(n_samples=200, centers=4, n_features=2, cluster_std=1.6,
                     random_state=50)
points = dataset[0]

# Import KMeans
from sklearn.cluster import KMeans

# Create kmeans object
kmeans = KMeans(n_clusters=4)
# fit the kmeans object to the dataset
kmeans.fit(points)
plt.scatter(dataset[0][:, 0], dataset[0][:, 1])
clusters = kmeans.cluster_centers_
# print out the cluster centers
print(clusters)

y_km = kmeans.fit_predict(points)

plt.scatter(points[y_km == 1, 0], points[y_km == 1, 1], s=50, color='blue')
plt.scatter(points[y_km == 0, 0], points[y_km == 0, 1], s=50, color='red')
plt.scatter(points[y_km == 2, 0], points[y_km == 2, 1], s=50, color='green')
plt.scatter(points[y_km == 3, 0], points[y_km == 3, 1], s=50, color='yellow')
plt.scatter(clusters[0][0], clusters[0][1], marker='*', s=200, color='black')
plt.scatter(clusters[1][0], clusters[1][1], marker='*', s=200, color='black')
plt.scatter(clusters[2][0], clusters[2][1], marker='*', s=200, color='black')
plt.scatter(clusters[3][0], clusters[3][1], marker='*', s=200, color='black')
plt.show()

OUTPUT –

Hierarchical Clustering
Aim – To demonstrate hierarchical clustering algorithm
Objective – Plotting of clusters using hierarchical clustering algorithm
Theory –
Hierarchical clustering is another unsupervised machine learning algorithm, which is used to group the
unlabeled datasets into a cluster and known as hierarchical cluster analysis or HCA. In this algorithm, we develop
the hierarchy of clusters in the form of a tree, and this tree shaped structure is known as the dendrogram.
Sometimes the results of K-means clustering and hierarchical clustering may look similar, but they both
differ depending on how they work. As there is no requirement to predetermine the number of clusters as we did in
the K-Means algorithm.
The hierarchical clustering technique has two approaches:
1. Agglomerative: Agglomerative is a bottom-up approach, in which the algorithm starts
by taking all data points as single clusters and merges them until one cluster is left.
2. Divisive: The divisive algorithm is the reverse of the agglomerative algorithm, as it is a
top-down approach.
CODE –

# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets

# Import iris data
iris = datasets.load_iris()
iris_data = pd.DataFrame(iris.data)
iris_data.columns = iris.feature_names
iris_data['flower_type'] = iris.target
iris_data.head()

iris_X = iris_data.iloc[:, [0, 1, 2, 3]].values
iris_Y = iris_data.iloc[:, 4].values

plt.figure(figsize=(10, 7))
plt.scatter(iris_X[iris_Y == 0, 0], iris_X[iris_Y == 0, 1], s=100,
            c='blue', label='Type 1')
plt.scatter(iris_X[iris_Y == 1, 0], iris_X[iris_Y == 1, 1], s=100,
            c='yellow', label='Type 2')
plt.scatter(iris_X[iris_Y == 2, 0], iris_X[iris_Y == 2, 1], s=100,
            c='green', label='Type 3')
plt.legend()
plt.show()

import scipy.cluster.hierarchy as sc

# Plot dendrogram
plt.figure(figsize=(20, 7))
sc.dendrogram(sc.linkage(iris_X, method='ward'))
plt.title('Dendrogram')
plt.xlabel('Sample index')
plt.ylabel('Euclidean distance')

from sklearn.cluster import AgglomerativeClustering
cluster = AgglomerativeClustering(n_clusters=3, affinity='euclidean',
                                  linkage='ward')
cluster.fit(iris_X)
labels = cluster.labels_

plt.figure(figsize=(10, 7))
plt.scatter(iris_X[labels == 0, 0], iris_X[labels == 0, 1], s=100, c='blue', label='Type 1')
plt.scatter(iris_X[labels == 1, 0], iris_X[labels == 1, 1], s=100, c='yellow', label='Type 2')
plt.scatter(iris_X[labels == 2, 0], iris_X[labels == 2, 1], s=100, c='green', label='Type 3')
plt.legend()
plt.show()
OUTPUT –

AIM: Demonstrate ensemble techniques like boosting, bagging, random forests etc.

RANDOM FOREST
Implement the Random Forest algorithm using Python.
Description
Dataset: Pima Indians Diabetes dataset (diabetes.csv, as loaded in the program below).
PROGRAM:

import pandas as pd
df = pd.read_csv("diabetes.csv")
df.head()
df.isnull().sum()
X = df.drop("Outcome",axis="columns")
y = df.Outcome
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
X_scaled[:3]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, stratify=y, random_state=10)
X_train.shape
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
scores = cross_val_score(RandomForestClassifier(n_estimators=50), X, y, cv=5)
scores.mean()

OUTPUT:

DIFFERENCE BETWEEN NAÏVE BAYESIAN AND DECISION TREE

BOOSTING

Aim – To demonstrate the boosting ensemble technique

Objective – To understand the working of the boosting algorithm
Theory –
Boosting is an ensemble learning technique that combines multiple weak or base
learners to create a strong predictive model. It is designed to improve the overall accuracy
of the model by sequentially training base learners, where each subsequent learner focuses
on correcting the mistakes made by the previous ones.
The boosting process starts by training an initial base learner on the entire dataset.
Then, it assigns higher weights to the misclassified instances, emphasizing their
importance in subsequent iterations. In each iteration, a new base learner is trained on the
modified dataset to give more attention to the previously misclassified instances.
The most popular boosting algorithm is AdaBoost (Adaptive Boosting), which
adjusts the weights of instances based on their difficulty to classify correctly. Another
commonly used algorithm is Gradient Boosting, which minimizes a loss function by
iteratively adding new base learners.
Boosting has proven to be highly effective in handling complex classification and
regression tasks, often outperforming individual base learners. It is known for its ability to
reduce bias and variance, handle noisy data, and provide robust predictions. However,
boosting can be sensitive to outliers and overfitting.
Code –
import numpy as np
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("/content/drive/MyDrive/Dataset/mushrooms.csv")
for col in df.columns:
    print('Unique value count of', col, 'is', len(df[col].unique()))

df = df.drop("veil-type", axis=1)
df.head(6)

label_encoder = LabelEncoder()
for column in df.columns:
    df[column] = label_encoder.fit_transform(df[column])

X = df.loc[:, df.columns != 'class']
Y = df['class']
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=100)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

adaboost = AdaBoostClassifier(n_estimators=50, learning_rate=0.2).fit(X_train, Y_train)
score = adaboost.score(X_test, Y_test)
print(score)

OUTPUT-

AIM: Build a classifier, compare its performance with an ensemble technique like
random forest.

RANDOM FOREST
Random Forest is a popular machine learning algorithm that belongs to the supervised learning
technique. It can be used for both Classification and Regression problems in ML. It is based on the
concept of ensemble learning, which is a process of combining multiple classifiers to solve a complex
problem and to improve the performance of the model. As the name suggests, "Random Forest is a
classifier that contains a number of decision trees on various subsets of the given dataset and takes the
average to improve the predictive accuracy of that dataset." Instead of relying on one decision tree, the
random forest takes the prediction from each tree, and based on the majority votes of predictions,
it predicts the final output. A greater number of trees in the forest leads to higher accuracy and
prevents the problem of overfitting.
Random Forest works in two phases: the first is to create the random forest by combining N decision
trees, and the second is to make predictions for each tree created in the first phase.

The working process can be explained in the following steps:
Step-1: Select random K data points from the training set.
Step-2: Build the decision trees associated with the selected data points (subsets).
Step-3: Choose the number N of decision trees that you want to build.
Step-4: Repeat Steps 1 & 2.
Step-5: For new data points, find the predictions of each decision tree, and assign the new data points
to the category that wins the majority votes.

Code –
# DECISION TREE
import pandas as pd
df = pd.read_csv("/content/drive/MyDrive/Dataset/diabetes.csv")
df.head()
X = df.drop("Outcome", axis="columns")
y = df.Outcome
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
X_scaled[:3]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, stratify=y,
                                                    random_state=10)
X_train.shape
X_test.shape
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
scores = cross_val_score(DecisionTreeClassifier(), X, y, cv=5)
scores
scores.mean()

# RANDOM FOREST
import pandas as pd
df = pd.read_csv("/content/drive/MyDrive/Dataset/diabetes.csv")
df.head()
df.isnull().sum()
X = df.drop("Outcome", axis="columns")
y = df.Outcome
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
X_scaled[:3]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, stratify=y,
                                                    random_state=10)
X_train.shape
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
scores = cross_val_score(RandomForestClassifier(n_estimators=50), X, y, cv=5)
scores.mean()

Output –

AIM: Evaluate various classification algorithms' performance on a dataset using
various measures like True Positive Rate, False Positive Rate, Precision, Recall etc.

The most commonly used performance metrics for classification problems are as
follows:

 Accuracy
 Confusion Matrix
 Precision, Recall, and F1 score
 ROC AUC
 Log-loss

Accuracy

Accuracy is the simple ratio between the number of correctly classified points and the
total number of points.

To calculate accuracy, scikit-learn provides a utility function:

from sklearn.metrics import accuracy_score

# predicted y values
y_pred = [0, 2, 1, 3]
# actual y values
y_true = [0, 1, 2, 3]
accuracy_score(y_true, y_pred)
# 0.5

Confusion Matrix

A Confusion Matrix is a summary of predicted results in a specific table layout that
allows visualization of the performance of a machine learning model for a binary
classification problem (2 classes) or a multi-class classification problem (more than 2
classes).

TP means True Positive. It can be interpreted as: the model predicted the positive class,
and it is True.

FP means False Positive. It can be interpreted as: the model predicted the positive class,
but it is False.

FN means False Negative. It can be interpreted as: the model predicted the negative class,
but it is False.

TN means True Negative. It can be interpreted as: the model predicted the negative class,
and it is True.

To calculate the confusion matrix, sklearn provides a utility function:

from sklearn.metrics import confusion_matrix
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
confusion_matrix(y_true, y_pred)
# array([[2, 0, 0],
#        [0, 0, 1],
#        [1, 0, 2]])

Precision, Recall, and F-1 Score

Precision is the fraction of correctly classified positive instances out of all instances the
model classified as positive. Recall is the fraction of correctly classified positive instances
out of all instances that are actually positive. In terms of the confusion matrix:

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)

For example, consider that a search query results in 30 pages, out of which 20 are
relevant, while the results fail to display 40 other relevant results. The precision is
then 20/30 and the recall is 20/60.

Precision helps us understand how useful the results are. Recall helps us understand
how complete the results are.

To balance precision and recall in a single measure, the F1 score is used. The F1 score
is the harmonic mean of precision and recall. It is given as:

F1 = 2 * (Precision * Recall) / (Precision + Recall)

For the search example above, F1 = 2 * (20/30) * (20/60) / (20/30 + 20/60) = 4/9 ≈ 0.44.

# Use the numpy library
import numpy as np

# These are the labels we predicted.
pred_labels = np.asarray([0, 1, 1, 0, 1, 0, 0])
print('pred labels:\t\t', pred_labels)

# These are the true labels.
true_labels = np.asarray([0, 0, 1, 0, 0, 1, 0])
print('true labels:\t\t', true_labels)

# pred labels: [0 1 1 0 1 0 0]
# true labels: [0 0 1 0 0 1 0]

Here's a quick way to compute true/false positives and true/false negatives.
Basically we will:

find the predicted and true labels that are assigned to some specific class
use the "AND" operator to combine the results into a single binary vector
sum over the binary vector to count how many incidences there are

# True Positive (TP): we predict a label of 1 (positive), and the true label is 1.
TP = np.sum(np.logical_and(pred_labels == 1, true_labels == 1))
# True Negative (TN): we predict a label of 0 (negative), and the true label is 0.
TN = np.sum(np.logical_and(pred_labels == 0, true_labels == 0))
# False Positive (FP): we predict a label of 1 (positive), but the true label is 0.
FP = np.sum(np.logical_and(pred_labels == 1, true_labels == 0))
# False Negative (FN): we predict a label of 0 (negative), but the true label is 1.
FN = np.sum(np.logical_and(pred_labels == 0, true_labels == 1))
print('TP: %i, FP: %i, TN: %i, FN: %i' % (TP, FP, TN, FN))

CODE –
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve

X, label = make_classification(n_samples=500, n_classes=2, weights=[1, 1], random_state=100)
X_train, X_test, y_train, y_test = train_test_split(X, label, test_size=0.3, random_state=1)
model = LogisticRegression()
model.fit(X_train, y_train)
probs = model.predict_proba(X_test)
probs = probs[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, probs)
plt.figure(figsize=(10, 6))
plt.plot(fpr, tpr, color='red', label='ROC')
plt.plot([0, 1], [0, 1], color='darkblue', linestyle='--')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic Curve')
plt.legend()
plt.show()

from sklearn import datasets
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
import matplotlib.pyplot as plt

data = datasets.load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
X_train, X_test, y_train, y_test = train_test_split(df.iloc[:, :-1], df.iloc[:, -1], test_size=0.3,
                                                    random_state=42)
model = LogisticRegression()
model.fit(X_train, y_train)
pred = model.predict(X_test)
precision = precision_score(y_test, pred)
recall = recall_score(y_test, pred)
print('Precision: ', precision)
print('Recall: ', recall)

OUTPUT:

AIM: Demonstrate GA for optimization (Minimization or Maximization problem).

Genetic Algorithm (GA) is a search-based optimization technique based on the
principles of Genetics and Natural Selection. It is frequently used to find optimal or
near-optimal solutions to difficult problems which otherwise would take a lifetime to
solve. It is frequently used to solve optimization problems, in research, and in
machine learning.

The GA algorithm has the following steps:

 Initialize a population N
 Selection: selection of the individuals to evolve from
 Crossover: crossover of couples of parents
 Mutation: random mutations of the individuals
PROGRAM:

import random

def fitness_function(x):
    return x**2 - 4*x

def generate_individual():
    return random.uniform(-10, 10)

def generate_population(population_size):
    return [generate_individual() for _ in range(population_size)]

def evaluate_population(population):
    return [fitness_function(individual) for individual in population]

def select_parents(population, fitness_values):
    # Using roulette wheel selection.
    # The fitness values can be negative here, so shift them so that all
    # selection weights are non-negative (random.choices requires this).
    min_fitness = min(fitness_values)
    weights = [f - min_fitness + 1e-6 for f in fitness_values]
    return random.choices(population, weights, k=2)

def crossover(parent1, parent2):
    # Using simple average crossover
    offspring1 = (parent1 + parent2) / 2.0
    offspring2 = (parent1 + parent2) / 2.0
    return offspring1, offspring2

def mutate(individual, mutation_rate):
    # Perturb the individual's solution by adding a small random value
    if random.random() < mutation_rate:
        individual += random.uniform(-1, 1)
    return individual

def replace_population(population, offspring):
    population_size = len(population)
    return population[:population_size - len(offspring)] + offspring

def genetic_algorithm(population_size, num_generations):
    population = generate_population(population_size)
    for _ in range(num_generations):
        fitness_values = evaluate_population(population)
        parents = select_parents(population, fitness_values)
        offspring = []
        while len(offspring) < population_size:
            parent1, parent2 = parents
            child1, child2 = crossover(parent1, parent2)
            child1 = mutate(child1, mutation_rate=0.1)
            child2 = mutate(child2, mutation_rate=0.1)
            offspring.extend([child1, child2])
        population = replace_population(population, offspring)
    # Find the best individual in the final population
    fitness_values = evaluate_population(population)
    best_individual = population[fitness_values.index(max(fitness_values))]
    return best_individual

# Example usage
population_size = 100
num_generations = 50
best_solution = genetic_algorithm(population_size, num_generations)
best_fitness = fitness_function(best_solution)
print("Best solution:", best_solution)
print("Best fitness:", best_fitness)
Output
AIM: Case study on supervised/unsupervised learning algorithms

In the real world, we are surrounded by humans who can learn from their
experiences, and by computers or machines that simply work on our instructions.
But can a machine also learn from experience or past data the way a human does?
This is where Machine Learning comes in.

Machine Learning is a subset of artificial intelligence that is mainly concerned
with the development of algorithms which allow a computer to learn from data and
past experiences on its own. The term machine learning was first introduced by
Arthur Samuel in 1959. We can define it in a summarized way as:

Machine learning enables a machine to automatically learn from data, improve
its performance with experience, and predict things without being explicitly
programmed.

With the help of sample historical data, known as training data, machine
learning algorithms build a mathematical model that helps in making predictions
or decisions without being explicitly programmed. Machine learning brings
computer science and statistics together to create predictive models. Machine
learning constructs or uses algorithms that learn from historical data: the more
information we provide, the better the performance.

A machine has the ability to learn if it can improve its performance by gaining more
data.

How does Machine Learning work

A Machine Learning system learns from historical data, builds prediction models,
and, whenever it receives new data, predicts the output for it. The accuracy of
the predicted output depends largely on the amount of data: a huge amount of
data helps build a better model, which predicts the output more accurately.

Suppose we have a complex problem in which we need to make predictions. Instead
of writing code for it, we just feed the data to generic algorithms, and the
machine builds the logic from the data and predicts the output. Machine learning
has changed the way we think about such problems.

[Block diagram: working of a Machine Learning algorithm]

Features of Machine Learning:

o Machine learning uses data to detect various patterns in a given dataset.
o It can learn from past data and improve automatically.
o It is a data-driven technology.
o Machine learning is similar to data mining in that it also deals with huge
amounts of data.

Need for Machine Learning

The need for machine learning is increasing day by day, because it is capable of
doing tasks that are too complex for a person to implement directly. As humans,
we have limitations: we cannot access and process huge amounts of data manually.
For this we need computer systems, and machine learning makes things easy for us.

We can train machine learning algorithms by providing them huge amounts of data
and letting them explore the data, construct models, and predict the required
output automatically. The performance of a machine learning algorithm depends on
the amount of data, and it can be assessed with a cost function. With the help of
machine learning, we can save both time and money.

The importance of machine learning can easily be understood from its use cases.
Currently, machine learning is used in self-driving cars, cyber fraud detection,
face recognition, friend suggestions by Facebook, and more. Top companies such as
Netflix and Amazon have built machine learning models that use vast amounts of
data to analyze user interests and recommend products accordingly.

Following are some key points which show the importance of Machine Learning:

o Rapid increase in the production of data
o Solving complex problems that are difficult for a human
o Decision making in various sectors, including finance
o Finding hidden patterns and extracting useful information from data

Classification of Machine Learning
At a broad level, machine learning can be classified into three types:

1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning

1) Supervised Learning

Supervised learning is a type of machine learning method in which we provide
sample labeled data to the machine learning system in order to train it, and on
that basis it predicts the output.

The system creates a model using labeled data to understand the datasets and
learn about each example; once training and processing are done, we test the
model by providing sample data to check whether it predicts the correct output.

The goal of supervised learning is to map input data to output data. Supervised
learning is based on supervision, just as a student learns under the supervision
of a teacher. An example of supervised learning is spam filtering.

Supervised learning can be grouped further into two categories of algorithms
(a minimal code sketch follows the list):

o Classification
o Regression
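
As an illustration, here is a minimal supervised classification sketch with scikit-learn; the choice of the built-in iris dataset and a decision tree classifier are assumptions made purely for demonstration:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Labeled data: features X with known target labels y
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train (supervise) the model on labeled examples
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Predict labels for unseen data and measure accuracy against the true labels
print('Accuracy:', accuracy_score(y_test, clf.predict(X_test)))
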

2) Unsupervised Learning

Unsupervised learning is a learning method in which a machine learns without any
supervision.

The training is provided to the machine with a set of data that has not been
labeled, classified, or categorized, and the algorithm needs to act on that data
without any supervision. The goal of unsupervised learning is to restructure the
input data into new features or groups of objects with similar patterns.

In unsupervised learning, we don't have a predetermined result. The machine
tries to find useful insights from huge amounts of data. It can be further
classified into two categories of algorithms (see the sketch after this list):

o Clustering
o Association
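
For illustration, here is a minimal clustering sketch with scikit-learn's KMeans; the synthetic data (make_blobs) and the number of clusters are assumptions made purely for demonstration:

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Unlabeled data: no target values are given to the algorithm
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Group the data into 3 clusters based purely on similarity
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)
print('Cluster assignments of first 10 points:', labels[:10])
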

3) Reinforcement Learning

Reinforcement learning is a feedback-based learning method in which a learning
agent gets a reward for each right action and a penalty for each wrong action.
The agent learns automatically from this feedback and improves its performance.
In reinforcement learning, the agent interacts with the environment and explores
it. The goal of the agent is to collect the most reward points, and in doing so
it improves its performance.

A robotic dog that automatically learns the movement of its arms is an example
of reinforcement learning.
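
To make the reward/penalty loop concrete, here is a toy sketch (not one of the manual's prescribed programs) of tabular Q-learning on a 5-state corridor; the environment, rewards, and hyperparameters are all illustrative assumptions:

import random

# Toy environment: states 0..4 on a line; reaching state 4 gives reward +1
n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2

for episode in range(200):
    state = 0
    while state != 4:
        # Explore randomly with probability epsilon, otherwise act greedily
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            action = Q[state].index(max(Q[state]))
        next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
        reward = 1.0 if next_state == 4 else -0.01  # small penalty per step
        # Q-learning update: move Q towards reward + discounted future value
        Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
        state = next_state

print('Learned Q-values:', Q)
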

History of Machine Learning

A few decades (about 40-50 years) ago, machine learning was science fiction, but
today it is part of our daily lives, from self-driving cars to Amazon's virtual
assistant "Alexa". The idea behind machine learning, however, is old and has a
long history. Some milestones in the history of machine learning are given below:
The early history of Machine Learning (Pre-1940):

o 1834: Charles Babbage, the father of the computer, conceived a device that
could be programmed with punch cards. The machine was never built, but all
modern computers rely on its logical structure.
o 1936: Alan Turing gave a theory of how a machine can determine and execute a
set of instructions.

The era of stored program computers:

o 1943: A human neural network was modeled with an electrical circuit. In 1950,
scientists started applying this idea and analyzed how human neurons might work.
o 1945: "ENIAC", the first electronic general-purpose computer, was completed;
it had to be operated and programmed manually. Stored-program computers such as
EDSAC in 1949 and EDVAC in 1951 followed.

Computing machinery and intelligence:

o 1950: Alan Turing published a seminal paper, "Computing Machinery and
Intelligence," on the topic of artificial intelligence, in which he asked,
"Can machines think?"

Machine intelligence in Games:

o 1952: Arthur Samuel, a pioneer of machine learning, created a program that
helped an IBM computer play checkers. It performed better the more it played.
o 1959: The term "Machine Learning" was first coined by Arthur Samuel.
The first "AI" winter:
o The duration of 1974 to 1980 was the tough time for AI and ML researchers, and
this duration was called as AI winter.
o In this duration, failure of machine translation occurred, and people had reduced
their interest from AI, which led to reduced funding by the government to the
researches.

Applications of Machine learning

Machine learning is a buzzword in today's technology, and it is growing very
rapidly day by day. We use machine learning in our daily lives, often without
knowing it, in Google Maps, Google Assistant, Alexa, and more. Below are some of
the most trending real-world applications of Machine Learning:

1. Image Recognition:

Image recognition is one of the most common applications of machine learning. It
is used to identify objects, persons, places, digital images, etc. A popular use
case of image recognition and face detection is automatic friend tagging
suggestions:

Facebook provides a feature of automatic friend tagging suggestions. Whenever we
upload a photo with our Facebook friends, we automatically get a tagging
suggestion with names; the technology behind this is machine learning's face
detection and recognition algorithm.

It is based on the Facebook project named "DeepFace," which is responsible for
face recognition and person identification in pictures.


2. Speech Recognition

While using Google, we get an option to "Search by voice"; this comes under
speech recognition, a popular application of machine learning.

Speech recognition is the process of converting voice instructions into text,
and it is also known as "speech to text" or "computer speech recognition". At
present, machine learning algorithms are widely used in various applications of
speech recognition. Google Assistant, Siri, Cortana, and Alexa use speech
recognition technology to follow voice instructions.
3. Traffic prediction:

If we want to visit a new place, we take the help of Google Maps, which shows us
the correct path with the shortest route and predicts the traffic conditions.

It predicts traffic conditions, such as whether traffic is clear, slow-moving,
or heavily congested, in two ways:

o Real-time location of the vehicle, from the Google Maps app and sensors
o Average time taken on past days at the same time

Everyone who uses Google Maps is helping to make the app better. It takes
information from users and sends it back to its database to improve performance.

4. Product recommendations:

Machine learning is widely used by various e-commerce and entertainment
companies, such as Amazon and Netflix, for recommending products and content to
their users.