
Marathwada Institute of Technology, Bulandshahr (U.P.)

DEPARTMENT OF MASTER OF COMPUTER APPLICATIONS

SUBJECT NOTES

KMC101 : ARTIFICIAL INTELLIGENCE

(UNIT-2) :- Data & Algorithms, History of Data, Data Storage and Importance of Data and its Acquisition, The Stages of Data Processing, Data Visualization, Regression, Prediction & Classification, Clustering & Recommender Systems.

B.TECH-I/KMC 101/AI/U-1/N-1/ Prof. Rashmi Bhardwaj

A history of data :-
Data is part of the fabric of life and society — and has been for a long time. The history
of data is a long story detailing the evolution of data collection, storage and processing.
It's said that knowledge is power.

When was data invented?


1640s
The first English use of the word "data" is from the 1640s. The word "data" was first
used to mean "transmissible and storable computer information" in 1946. The
expression "data processing" was first used in 1954. The Latin word data is the
plural of datum, "(thing) given," neuter past participle of dare, "to give".

What exactly is data?


Mobile data is what allows your phone to get online when you're away from Wi-Fi.
Mobile-enabled devices can send and receive information over a wireless cellular
connection.

What is the importance of historical data?


Historical data enables the tracking of improvement over time, which
gives key insights. These insights are essential for driving a business. Marketers are
always trying to better understand and segment their customers. Keeping historical
data can help marketers understand if their customer segment is changing.
TYPES OF DATA :-

1. Personal data
Personal data is anything that is specific to you. It covers your demographics, your
location, your email address and other identifying factors.

2. Transactional data
Transactional data is anything that requires an action to collect. You might click
on an ad, make a purchase, visit a certain web page, etc.

3. Web data
Web data is a collective term which refers to any type of data you might pull from
the internet, whether to study for research purposes or otherwise. That might be
data on what your competitors are selling, published government data, football
scores, etc.

4. Sensor data
Sensor data is produced by connected objects, often referred to collectively as the
Internet of Things (IoT). It covers everything from your smartwatch measuring your
heart rate to a building with external sensors that measure the weather.

The importance of data collection


Data collection differs from data mining in that it is a process by which data is
gathered and measured. All this must be done before high quality research can
begin and answers to lingering questions can be found. Data collection is usually
done with software, and there are many different data collection procedures,
strategies, and techniques. Most data collection is centered on electronic data, and
since this type of data collection encompasses so much information, it usually
crosses into the realm of big data.

So why is data collection important? It is through data collection that a business or
management has the quality information it needs to make informed decisions
from further analysis, study, and research. Without data collection, companies
would stumble around in the dark using outdated methods to make their decisions.
Data collection instead allows them to stay on top of trends, provide answers to
problems, and analyze new insights to great effect.
Data Acquisition Systems :-
The systems used for data acquisition are known as data acquisition systems. These
data acquisition systems will perform the tasks such as conversion of data, storage of
data, transmission of data and processing of data.
Data acquisition systems consider the following analog signals.
 Analog signals, which are obtained from the direct measurement of electrical
quantities such as DC & AC voltages, DC & AC currents, resistance, etc.
 Analog signals, which are obtained from transducers such as LVDTs,
thermocouples, etc.

Types of Data Acquisition Systems


Data acquisition systems can be classified into the following two types.

 Analog Data Acquisition Systems


 Digital Data Acquisition Systems
Now, let us discuss these two types of data acquisition systems one by one.

Analog Data Acquisition Systems

The data acquisition systems, which can be operated with analog signals are known
as analog data acquisition systems. Following are the blocks of analog data
acquisition systems.
 Transducer − It converts physical quantities into electrical signals.
 Signal conditioner − It performs the functions like amplification and selection of
desired portion of the signal.
 Display device − It displays the input signals for monitoring purpose.
 Graphic recording instruments − These can be used to make the record of input
data permanently.
 Magnetic tape instrumentation − It is used for acquiring, storing & reproducing of
input data.

Digital Data Acquisition Systems

The data acquisition systems, which can be operated with digital signals are known
as digital data acquisition systems. So, they use digital components for storing or
displaying the information.
Mainly, the following operations take place in digital data acquisition.

 Acquisition of analog signals


 Conversion of analog signals into digital signals or digital data
 Processing of digital signals or digital data
Following are the blocks of Digital data acquisition systems.
 Transducer − It converts physical quantities into electrical signals.
 Signal conditioner − It performs the functions like amplification and selection of
desired portion of the signal.
 Multiplexer − It connects one of multiple inputs to the output, acting as a
parallel-to-serial converter.
 Analog to Digital Converter − It converts the analog input into its equivalent
digital output.
 Display device − It displays the data in digital format.
 Digital Recorder − It is used to record the data in digital format.
Data acquisition systems are being used in various applications such as biomedical
and aerospace. So, we can choose either analog data acquisition systems or digital
data acquisition systems based on the requirement.
DATA & ALGORITHMS :-

Artificial Intelligence has grown to have a significant impact on the world.


With large amounts of data being generated by different applications and
sources, machine learning systems can learn from this data and perform
intelligent tasks.
Artificial Intelligence is the field of computer science that deals with imparting
decisive ability and thinking ability to machines. Artificial Intelligence
is thus a blend of computer science, data analytics, and pure mathematics.
Machine learning is an integral part of Artificial Intelligence; it deals with the
process of learning from input data. Artificial Intelligence and its benefits have
never ceased to amaze us.

Table of Contents
 Types of Artificial Intelligence Algorithms.
o 1. Classification Algorithms.
 a) Naive Bayes.
 b) Decision Tree.
 c) Random Forest.
 d) Support Vector Machines.
 e) K Nearest Neighbours.
o 2. Regression Algorithms.
 a) Linear regression.
 b) Lasso Regression.
 c) Logistic Regression.
 d) Multivariate Regression.
 e) Multiple Regression Algorithm.
o 3. Clustering Algorithms.
 a) K-Means Clustering.
 b) Fuzzy C-means Algorithm.
 c) Expectation-Maximisation (EM) Algorithm.
 d) Hierarchical Clustering Algorithm.

Types of Artificial Intelligence Algorithms


Artificial intelligence algorithms can be broadly classified as :
1. Classification Algorithms
Classification algorithms are part of supervised learning. These
algorithms are used to divide the subjected variable into different classes
and then predict the class for a given input. For example, classification
algorithms can be used to classify emails as spam or not. Let’s discuss
some of the commonly used classification algorithms.
a) Naive Bayes
The Naive Bayes algorithm works on Bayes' theorem and takes a probabilistic
approach, unlike other classification algorithms. The algorithm has a set
of prior probabilities for each class. Once data is fed in, the algorithm
updates these probabilities to form what is known as the posterior
probability. This is useful when you need to predict whether the
input belongs to a given list of classes or not.
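As an illustration of this probabilistic approach, here is a minimal Naive Bayes sketch in Python. The toy "spam" dataset, the two binary features, and the add-one smoothing are all assumptions made for this example, not part of any standard library:

```python
from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    """Estimate class priors and per-feature value counts from labelled samples.
    Each sample is a tuple of discrete feature values."""
    priors = Counter(labels)
    likelihoods = defaultdict(Counter)  # (class, feature_index) -> value counts
    for sample, label in zip(samples, labels):
        for i, value in enumerate(sample):
            likelihoods[(label, i)][value] += 1
    return priors, likelihoods

def predict(priors, likelihoods, sample):
    """Pick the class with the highest (unnormalised) posterior score."""
    total = sum(priors.values())
    best_class, best_score = None, -1.0
    for label, count in priors.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(sample):
            counts = likelihoods[(label, i)]
            # add-one smoothing so unseen feature values don't zero out the score
            score *= (counts[value] + 1) / (count + len(counts) + 1)
        if score > best_score:
            best_class, best_score = label, score
    return best_class

# Toy "spam" data: features are (contains_offer, contains_link)
samples = [(1, 1), (1, 0), (0, 1), (0, 0), (0, 0)]
labels  = ["spam", "spam", "spam", "ham", "ham"]
priors, likelihoods = train_naive_bayes(samples, labels)
print(predict(priors, likelihoods, (1, 1)))  # most likely class for a mail with both cues
```

Note how the prediction multiplies the prior by each feature likelihood, which is exactly the (naive) independence assumption behind the algorithm.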
b) Decision Tree
The decision tree algorithm is a flowchart-like algorithm where nodes
represent tests on an input attribute and branches represent the outcomes
of those tests.
c) Random Forest
Random forest works like a group of decision trees. The input data set is
subdivided and fed into different decision trees. The average of the outputs
from all decision trees is considered. Random forests offer a more
accurate classifier compared to the decision tree algorithm.
d) Support Vector Machines
SVM is an algorithm that classifies data using a hyperplane, making sure
that the distance between the hyperplane and support vectors is
maximum.
e) K Nearest Neighbours
KNN algorithm uses a bunch of data points segregated into classes to
predict the class of a new sample data point. It is called a “lazy learning
algorithm” because it defers all computation to prediction time: it simply
stores the training data instead of building a model.
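A minimal KNN sketch follows. The 2-D points, the labels and the value of k are assumptions chosen for the example; note there is no training step, only a distance lookup at prediction time:

```python
from collections import Counter
import math

def knn_predict(points, labels, query, k=3):
    """Classify `query` by majority vote among the k nearest labelled points."""
    # No training step: the algorithm just stores the data (hence "lazy learning")
    dists = sorted((math.dist(p, query), lbl) for p, lbl in zip(points, labels))
    votes = Counter(lbl for _, lbl in dists[:k])
    return votes.most_common(1)[0][0]

# Two hypothetical clusters of labelled points
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["red", "red", "red", "blue", "blue", "blue"]
print(knn_predict(points, labels, (2, 2)))  # nearest neighbours are all "red"
```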
2. Regression Algorithms
Regression algorithms are popular algorithms under supervised machine
learning. They can predict output values based on input data points fed
into the learning system. The main applications of regression algorithms
include predicting stock market prices, predicting the weather, etc. The
most common algorithms under this section are:
a) Linear regression
It is used to estimate real-valued outputs from continuous input
variables. It is the simplest of all regression algorithms but can be
implemented only in cases of linear relationship or a linearly separable
problem. The algorithm draws a straight line between data points called
the best-fit line or regression line and is used to predict new values.
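The best-fit line can be computed in closed form with least squares. Below is a minimal sketch; the data points are hypothetical, chosen to lie near the line y = 2x + 1:

```python
def fit_line(xs, ys):
    """Least-squares best-fit line: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
          / sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying near the line y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]
slope, intercept = fit_line(xs, ys)
predicted = slope * 6 + intercept  # use the regression line to predict a new value
print(round(slope, 2), round(intercept, 2))
```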
b) Lasso Regression
Lasso regression algorithm works by obtaining the subset of predictors
that minimizes prediction error for a response variable. This is achieved
by imposing a constraint on data points and allowing some of them to
shrink to zero value.
c) Logistic Regression
Logistic regression is mainly used for binary classification. This method
allows you to analyze a set of variables and predict a categorical outcome.
Its primary applications include spam detection, predicting whether a
customer will churn, credit-default prediction, etc.
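A minimal sketch of logistic regression for binary classification, trained by stochastic gradient descent on one-dimensional inputs. The data, learning rate and epoch count are arbitrary choices for the example:

```python
import math

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weight w and bias b by gradient descent on the logistic loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(w * x + b)))  # sigmoid squashes to (0, 1)
            w -= lr * (p - y) * x                  # gradient of the log-loss
            b -= lr * (p - y)
    return w, b

# Hypothetical binary data: small inputs -> class 0, large inputs -> class 1
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
prob = 1 / (1 + math.exp(-(w * 3.8 + b)))
print(prob > 0.5)  # a large input should be assigned to class 1
```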
d) Multivariate Regression
This algorithm has to be used when there is more than one predictor
variable. This algorithm is extensively used in retail sector product
recommendation engines, where customers' preferred products will
depend on multiple factors like brand, quality, price, reviews, etc.
e) Multiple Regression Algorithm
Multiple Regression Algorithm uses a combination of linear regression
and non-linear regression algorithms taking multiple explanatory
variables as inputs. The main applications include social science research,
insurance claim genuineness, behavioural analysis, etc.
3. Clustering Algorithms
Clustering is the process of segregating and organizing the data points
into groups based on similarities within members of the group. This is
part of unsupervised learning. The main aim is to group similar items.
For example, it can arrange all transactions of fraudulent nature together
based on some properties in the transaction. Below are the most common
clustering algorithms.
a) K-Means Clustering
It is the simplest unsupervised learning algorithm. The algorithm gathers
similar data points together and then binds them together into a cluster.
The clustering is done by calculating the centroid of the group of data
points and then evaluating the distance of each data point from the
centroid of the cluster. Based on the distance, the analyzed data point is
then assigned to the closest cluster. ‘K’ in K-means stands for the number
of clusters the data points are being grouped into.
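The assign-to-nearest-centroid / recompute-centroid loop described above can be sketched for one-dimensional data as follows. The points, the seed and the fixed iteration count are assumptions made for the example:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on 1-D points: returns the final centroids, sorted."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise centroids from the data
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two well-separated hypothetical groups, around 1 and around 10
points = [0.8, 1.0, 1.2, 9.8, 10.0, 10.2]
print(kmeans(points, k=2))
```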
b) Fuzzy C-means Algorithm
The FCM algorithm works on degrees of membership. Each data point is
assigned a probability of belonging to each cluster. Data points don't have
absolute membership of a particular cluster, and this is why the
algorithm is called fuzzy.
c) Expectation-Maximisation (EM) Algorithm
It is based on the Gaussian distribution we learned about in statistics. Data is
modelled with a Gaussian distribution to solve the problem. After
assigning a probability, a point sample is calculated based on the expectation
and maximization equations.
d) Hierarchical Clustering Algorithm
These algorithms sort clusters into a hierarchical order after learning the data
points and making similarity observations. They can be of two types:
 Divisive clustering, for a top-down approach
 Agglomerative clustering, for a bottom-up approach.
Algorithms :-
Introduction
Artificial intelligence (AI) is the intelligence of machines and the branch of computer
science that aims to create it. AI textbooks define the field as "the study and design of
intelligent agents"[1]where an intelligent agent is a system that perceives its environment
and takes actions that maximize its chances of success.[2] John McCarthy, who coined the
term in 1955,[3] defines it as "the science and engineering of making intelligent
machines".[4]

AI research is highly technical and specialized, deeply divided into subfields that often fail
to communicate with each other.[5] Some of the division is due to social and cultural
factors: subfields have grown up around particular institutions and the work of individual
researchers. AI research is also divided by several technical issues. There are subfields
which are focussed on the solution of specific problems, on one of several possible
approaches, on the use of widely differing tools and towards the accomplishment of
particular applications. The central problems of AI include such traits as
reasoning, knowledge, planning, learning, communication, perception and the ability to
move and manipulate objects.[6] General intelligence (or "strong AI") is still among the
field's long term goals.[7] Currently popular approaches include statistical methods,
computational intelligence and traditional symbolic AI.

II. Genetic Algorithm


This article describes how to solve a logic problem using a Genetic Algorithm. It assumes no prior
knowledge of GAs. A genetic algorithm is a search technique used in computing to find exact or
approximate solutions to optimization and search problems, and is often abbreviated as GA. Genetic
algorithms are categorized as global search heuristics.
Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by
evolutionary biology such as inheritance, mutation, selection, and crossover(also called recombination).
Genetic algorithms are implemented as a computer simulation in which a population of abstract
representations (called chromosomes or the genotype or the genome) of candidate solutions (called
individuals, creatures, or phenotypes) to an optimization problem evolves towards better solutions.
Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible.

In computer science and operations research, a genetic algorithm is a metaheuristic inspired by the
process of natural selection that belongs to the larger class of evolutionary algorithms.

A genetic algorithm is a search heuristic that is inspired by Charles Darwin’s
theory of natural evolution. This algorithm reflects the process of natural
selection, where the fittest individuals are selected for reproduction in order to
produce the offspring of the next generation.
Notion of Natural Selection

The process of natural selection starts with the selection of fittest individuals from
a population. They produce offspring which inherit the characteristics of the
parents and will be added to the next generation. If parents have better fitness,
their offspring will be better than parents and have a better chance at surviving.
This process keeps on iterating and at the end, a generation with the fittest
individuals will be found.

This notion can be applied to a search problem. We consider a set of solutions to
a problem and select the set of best ones out of them.

Five phases are considered in a genetic algorithm.

1. Initial population

2. Fitness function

3. Selection

4. Crossover

5. Mutation

Initial Population

The process begins with a set of individuals which is called a Population. Each
individual is a solution to the problem you want to solve.

An individual is characterized by a set of parameters (variables) known as Genes.


Genes are joined into a string to form a Chromosome (solution).
In a genetic algorithm, the set of genes of an individual is represented using a
string, in terms of an alphabet. Usually, binary values are used (string of 1s and
0s). We say that we encode the genes in a chromosome.

Population, Chromosomes and Genes

Fitness Function

The fitness function determines how fit an individual is (the ability of an
individual to compete with other individuals). It gives a fitness score to each
individual. The probability that an individual will be selected for reproduction is
based on its fitness score.

Selection

The idea of the selection phase is to select the fittest individuals and let them
pass their genes on to the next generation.

Two pairs of individuals (parents) are selected based on their fitness scores.
Individuals with high fitness have more chance to be selected for reproduction.

Crossover

Crossover is the most significant phase in a genetic algorithm. For each pair of
parents to be mated, a crossover point is chosen at random from within the
genes. For example, consider the crossover point to be 3 as shown below.

Crossover point
Offspring are created by exchanging the genes of parents among themselves until
the crossover point is reached.

Exchanging genes among parents

The new offspring are added to the population.

New offspring

Mutation

In certain new offspring formed, some of their genes can be subjected to
a mutation with a low random probability. This implies that some of the bits in
the bit string can be flipped.

Mutation: Before and After

Mutation occurs to maintain diversity within the population and prevent
premature convergence.
Termination

The algorithm terminates if the population has converged (does not produce
offspring which are significantly different from the previous generation). Then it
is said that the genetic algorithm has provided a set of solutions to our problem.
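The five phases above can be put together in a short, illustrative Python sketch. It maximizes the number of 1-bits in a 20-bit chromosome (a classic toy fitness function); the population size, mutation rate and selection scheme are arbitrary choices for this example, not prescribed by the notes:

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def fitness(chromosome):
    """Toy fitness: number of 1-bits (the optimum is the all-ones string)."""
    return sum(chromosome)

def evolve(pop_size=30, length=20, generations=60):
    # Initial population: random binary chromosomes
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            point = rng.randrange(1, length)   # single-point crossover
            child = a[:point] + b[point:]
            if rng.random() < 0.05:            # low-probability mutation
                i = rng.randrange(length)
                child[i] ^= 1                  # flip one bit
            children.append(child)
        pop = children                         # next generation replaces the old one
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))
```

After a few dozen generations the fittest chromosome should be at (or very near) the all-ones optimum, illustrating the convergence described in the Termination paragraph.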

Path finding algorithms


Pathfinding algorithms are usually an attempt to solve the shortest path problem in
graph theory. They try to find the best path given a starting point and ending point
based on some predefined criteria.

The algorithms we shall discuss in this tutorial are the decision-based algorithms as well as the simplest
of the recursive-type algorithms, because they are easy to comprehend and easy to implement. BFS,
Dijkstra and A* are more complex algorithms which may require special data structures and are harder to
implement. Path-finding consumes a significant amount of resources, especially in movement-intensive
games such as (massively) multiplayer games.
Studies of several path-finding techniques on workloads derived from real player movements in a
multiplayer game find that a map-conforming, hierarchical path-finding strategy performs best, and in
combination with caching optimizations can greatly reduce path-finding cost. Performance is dominated
primarily by the algorithm, and only to a lesser degree by workload variation. Understanding the real
impact of path-finding techniques allows for refined testing and optimization of game design.
Heuristic Function :-
A heuristic is a technique that improves the efficiency of the search process, possibly by sacrificing
claims of completeness. While the almost-perfect heuristic is significant for theoretical analysis, it is not
common to find such a heuristic in practice. Heuristics play a major role in search strategies because of
the exponential nature of most problems.
Heuristics help to reduce the number of alternatives from an exponential number to a polynomial number.
Heuristic search has been widely used in both deterministic and probabilistic planning. Yet, although
bidirectional heuristic search has been applied broadly in deterministic planning problems, its function in
probabilistic domains is only sparsely studied.
A heuristic function is a function that maps from problem state descriptions to measures of desirability,
usually represented as a number. Heuristic functions generally have different errors in different states.
Heuristic functions play a crucial role in optimal planning, and the theoretical limitations of algorithms
using such functions are therefore of interest. Much work has focused on finding bounds on the behavior
of heuristic search algorithms using heuristics with specific attributes.

“A heuristic function, also called simply a heuristic, is a function that ranks alternatives in
search algorithms at each branching step based on available information to decide which
branch to follow. For example, it may approximate the exact solution.”
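To make the idea concrete, two common grid-map heuristics (Manhattan and Euclidean distance, both used later in the A* discussion) can be written as small Python functions. The coordinate-tuple convention is just an illustrative choice:

```python
import math

def manhattan(cell, goal):
    """Manhattan distance: admissible for 4-way movement on a grid."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

def euclidean(cell, goal):
    """Straight-line distance: admissible for movement in any direction."""
    return math.hypot(cell[0] - goal[0], cell[1] - goal[1])

# Both estimate the remaining cost from (0, 0) to (3, 4); they never overestimate
print(manhattan((0, 0), (3, 4)), euclidean((0, 0), (3, 4)))
```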
Breadth First Search (BFS) Algorithm

BFS stands for Breadth First Search, a vertex-based technique for finding the
shortest path in a graph. It uses a Queue data structure, which follows first in,
first out. In BFS, one vertex is selected at a time: it is visited and marked, and
then its adjacent vertices are visited and stored in the queue.

Breadth first search is a graph traversal algorithm that starts traversing the graph
from the root node and explores all the neighbouring nodes. Then it selects the
nearest node and explores all the unexplored nodes. The algorithm follows the same
process for each nearest node until it finds the goal.

The algorithm of breadth first search is given below. The algorithm starts by
examining node A and all of its neighbours. In the next step, the neighbours of
the nearest node of A are explored, and the process continues in the further steps.
The algorithm explores all neighbours of all the nodes and ensures that each node is
visited exactly once.

Algorithm
o Step 1: SET STATUS = 1 (ready state)
for each node in G
o Step 2: Enqueue the starting node A
and set its STATUS = 2
(waiting state)
o Step 3: Repeat Steps 4 and 5 until
QUEUE is empty
o Step 4: Dequeue a node N. Process it
and set its STATUS = 3
(processed state).
o Step 5: Enqueue all the neighbours of
N that are in the ready state
(whose STATUS = 1) and set
their STATUS = 2
(waiting state)
[END OF LOOP]
o Step 6: EXIT

Example
Consider the graph G shown in the following image, calculate the minimum path p
from node A to node E. Given that each edge has a length of 1.
Solution:
Minimum path P can be found by applying the breadth first search algorithm, which
will begin at node A and end at E. The algorithm uses two queues,
namely QUEUE1 and QUEUE2. QUEUE1 holds all the nodes that are to be processed,
while QUEUE2 holds all the nodes that have been processed and deleted from QUEUE1.

Let's start examining the graph from Node A.

1. Add A to QUEUE1 and NULL to QUEUE2.

1. QUEUE1 = {A}
2. QUEUE2 = {NULL}

2. Delete the Node A from QUEUE1 and insert all its neighbours. Insert Node A
into QUEUE2

1. QUEUE1 = {B, D}
2. QUEUE2 = {A}

3. Delete the node B from QUEUE1 and insert all its neighbours. Insert node B into
QUEUE2.

1. QUEUE1 = {D, C, F}
2. QUEUE2 = {A, B}

4. Delete the node D from QUEUE1 and insert all its neighbours. Since F is the only
neighbour of it which has been inserted, we will not insert it again. Insert node D
into QUEUE2.

1. QUEUE1 = {C, F}
2. QUEUE2 = { A, B, D}

5. Delete the node C from QUEUE1 and insert all its neighbours. Add node C to
QUEUE2.

1. QUEUE1 = {F, E, G}
2. QUEUE2 = {A, B, D, C}
6. Remove F from QUEUE1 and add all its neighbours. Since all of its neighbours
have already been added, we will not add them again. Add node F to QUEUE2.

1. QUEUE1 = {E, G}
2. QUEUE2 = {A, B, D, C, F}

7. Remove E from QUEUE1. All of E's neighbours have already been added to
QUEUE1, therefore we will not add them again. All the nodes are visited, and the
target node, i.e. E, has been added to QUEUE2.

1. QUEUE1 = {G}
2. QUEUE2 = {A, B, D, C, F, E}

Now, backtrack from E to A, using the nodes available in QUEUE2.

The minimum path will be A → B → C → E.


Ex-
A
/ \
B C
/ / \
D E F
Output is:
A, B, C, D, E, F
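The example above can be reproduced with a short Python sketch using a FIFO queue; the adjacency-list encoding of the tree is an assumption made for the example:

```python
from collections import deque

def bfs(graph, start):
    """Visit nodes level by level using a FIFO queue; each node is enqueued once."""
    visited = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()                 # dequeue (first in, first out)
        for neighbour in graph.get(node, []):
            if neighbour not in visited:       # mark before enqueueing
                visited.append(neighbour)
                queue.append(neighbour)
    return visited

# The example tree: A has children B and C; B has D; C has E and F
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E", "F"]}
print(bfs(graph, "A"))  # visits level by level: A, B, C, D, E, F
```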

Depth First Search (DFS) Algorithm


The depth first search (DFS) algorithm starts with the initial node of graph G and
goes deeper and deeper until we find the goal node or a node with no
children. The algorithm then backtracks from the dead end towards the most recent
node that is not yet completely explored.

The data structure used in DFS is a stack. The process is similar to the BFS
algorithm. In DFS, the edges that lead to an unvisited node are called discovery
edges, while the edges that lead to an already visited node are called back edges.

Algorithm
o Step 1: SET STATUS = 1 (ready state) for each node in G
o Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting
state)
o Step 3: Repeat Steps 4 and 5 until STACK is empty
o Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed
state)
o Step 5: Push on the stack all the neighbours of N that are in the ready state
(whose STATUS = 1) and set their
STATUS = 2 (waiting state)
[END OF LOOP]
o Step 6: EXIT

Example :
Consider the graph G along with its adjacency list, given in the figure below.
Calculate the order to print all the nodes of the graph starting from node H, by using
depth first search (DFS) algorithm.

Solution :
Push H onto the stack

1. STACK : H

POP the top element of the stack, i.e. H, print it and push all the neighbours of H onto
the stack that are in the ready state.

1. Print H
2. STACK : A

Pop the top element of the stack i.e. A, print it and push all the neighbours of A onto
the stack that are in ready state.

1. Print A
2. Stack : B, D

Pop the top element of the stack i.e. D, print it and push all the neighbours of D onto
the stack that are in ready state.

1. Print D
2. Stack : B, F

Pop the top element of the stack i.e. F, print it and push all the neighbours of F onto
the stack that are in ready state.

1. Print F
2. Stack : B

Pop the top of the stack i.e. B and push all the neighbours

1. Print B
2. Stack : C

Pop the top of the stack i.e. C and push all the neighbours.

1. Print C
2. Stack : E, G

Pop the top of the stack i.e. G and push all its neighbours.

1. Print G
2. Stack : E

Pop the top of the stack i.e. E and push all its neighbours.

1. Print E
2. Stack :

Hence, the stack now becomes empty and all the nodes of the graph have been
traversed.

The printing sequence of the graph will be :

1. H → A → D → F → B → C → G → E

Ex-
A
/ \
B C
/ / \
D E F
Output is:
A, B, D, C, E, F
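The same example can be traversed with an explicit stack in a few lines of Python; as with the BFS sketch, the adjacency-list encoding of the tree is an assumption for the example:

```python
def dfs(graph, start):
    """Visit nodes depth-first using an explicit stack (last in, first out)."""
    visited = []
    stack = [start]
    while stack:
        node = stack.pop()                     # pop the most recently pushed node
        if node not in visited:
            visited.append(node)
            # Push children in reverse so the leftmost child is popped first
            stack.extend(reversed(graph.get(node, [])))
    return visited

# The example tree: A has children B and C; B has D; C has E and F
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E", "F"]}
print(dfs(graph, "A"))  # goes deep first: A, B, D, C, E, F
```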
Difference between BFS And DFS :-

 BFS uses a Queue (first in, first out); DFS uses a Stack (last in, first out).
 BFS visits nodes level by level; DFS follows one path as deep as possible before backtracking.
 BFS finds the shortest path in an unweighted graph; DFS does not guarantee the shortest path.
 BFS generally needs more memory, since it keeps an entire level of nodes in the queue at once.

A* search algorithm
A* is a graph traversal and path search algorithm, which is often used in many fields of computer
science due to its completeness, optimality, and optimal efficiency. One major practical drawback is
its O(b^d) space complexity, as it stores all generated nodes in memory.

What is A* Search Algorithm?


The A* Search algorithm is one of the best and most popular techniques used in
path-finding and graph traversals.
Why A* Search Algorithm ?
Informally speaking, the A* Search algorithm, unlike other traversal techniques,
has “brains”. It is a smart algorithm, and this separates it from the other
conventional algorithms. This is explained in detail in the sections below.
It is also worth mentioning that many games and web-based maps use this
algorithm to find the shortest path very efficiently (by approximation).

Explanation
Consider a square grid having many obstacles and we are given a starting cell
and a target cell. We want to reach the target cell (if possible) from the starting
cell as quickly as possible. Here A* Search Algorithm comes to the rescue.
What the A* Search Algorithm does is that at each step it picks the node according
to a value ‘f’, which is a parameter equal to the sum of two other parameters,
‘g’ and ‘h’. At each step it picks the node/cell having the lowest ‘f’, and processes
that node/cell.
We define ‘g’ and ‘h’ as simply as possible below
g = the movement cost to move from the starting point to a given square on the
grid, following the path generated to get there.
h = the estimated movement cost to move from that given square on the grid to
the final destination. This is often referred to as the heuristic, which is nothing
but a kind of smart guess. We really don’t know the actual distance until we find
the path, because all sorts of things can be in the way (walls, water, etc.). There
can be many ways to calculate this ‘h’ which are discussed in the later sections.
Algorithm
We create two lists – Open List and Closed List (just like Dijkstra's Algorithm)

// A* Search Algorithm
1. Initialize the open list
2. Initialize the closed list
   put the starting node on the open list (you can leave its f at zero)
3. while the open list is not empty
   a) find the node with the least f on the open list, call it "q"
   b) pop q off the open list
   c) generate q's 8 successors and set their parents to q
   d) for each successor
      i)   if successor is the goal, stop search
      ii)  successor.g = q.g + distance between successor and q
           successor.h = distance from goal to successor (this can be
           done in many ways; we will discuss three heuristics:
           Manhattan, Diagonal and Euclidean)
           successor.f = successor.g + successor.h
      iii) if a node with the same position as successor is in the OPEN
           list and has a lower f than successor, skip this successor
      iv)  if a node with the same position as successor is in the
           CLOSED list and has a lower f than successor, skip this
           successor; otherwise, add the node to the open list
      end (for loop)
   e) push q on the closed list
   end (while loop)
So suppose, as in the below figure, we want to reach the target cell from the
source cell; the A* Search algorithm would then follow the path shown below.
Note that the figure is made by considering Euclidean distance as the heuristic.
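The pseudocode above can be condensed into a small runnable Python sketch. It is simplified to 4-way movement with the Manhattan heuristic and returns only the length of the shortest path; the grid, the coordinate convention and all names are assumptions made for the example:

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid; 0 = free cell, 1 = obstacle.
    Returns the length of the shortest path, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: an admissible heuristic for 4-way movement
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_list = [(h(start), 0, start)]         # heap of (f, g, cell)
    best_g = {start: 0}
    while open_list:
        f, g, cell = heapq.heappop(open_list)  # the node with the lowest f
        if cell == goal:
            return g
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                # only keep a successor if it improves on the best known g
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_list, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

# A small hypothetical grid with a wall forcing a detour
grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
print(astar(grid, (0, 0), (2, 0)))  # must go around the wall of 1s
```

The `best_g` dictionary plays the role of the OPEN/CLOSED bookkeeping in the pseudocode: a successor is skipped whenever a cheaper route to the same cell is already known.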

Data Visualization for Artificial Intelligence

“Data visualization of the performance of algorithms for the purpose of
identifying anomalies and generating trust is going to be the major growth
area in data visualization in the coming years.”
Data visualization algorithms create images from raw data and display hidden
correlations so that humans can process the information more effectively. Data
visualization is also an important evaluation metric for deep learning, since the
ultimate goal of artificial intelligence is to create a machine that can understand and
respond to data even better than a human could.

Machine Learning Data Visualization Examples:

When designing and evaluating a new algorithm, one of the first steps is exploratory
data analysis (EDA). The point is to find the most efficient learning approach for a
given problem. For the human researcher to understand what’s working and what’s
not, the model's results are often displayed graphically. Since these datasets cover
many variables and are high-dimensional, several new data visualization
techniques have been developed specifically for deep learning systems. Some of the
most common new tools for interpreting high-dimensional relationships are:

 Parallel coordinate plots
 Scatterplot matrices
 Scagnostics
 Multidimensional scaling (MDS)
 The t-SNE algorithm

What is Regression?

Introduction
Linear regression and logistic regression are two types of regression
analysis techniques that are used to solve the regression problem
using machine learning. They are the most prominent techniques of
regression. But, there are many types of regression analysis techniques in
machine learning, and their usage varies according to the nature of the data
involved.
This article will explain the different types of regression in machine learning,
and under what condition each of them can be used. If you are new to machine
learning, this article will surely help you in understanding the regression
modelling concept.
What is Regression Analysis?
Regression analysis is a predictive modelling technique that analyzes the
relation between the target or dependent variable and independent variable in
a dataset. The different types of regression analysis techniques get used when
the target and independent variables show a linear or non-linear relationship
between each other, and the target variable contains continuous values. The
regression technique gets used mainly to determine the predictor strength,
forecast trend, time series, and in case of cause & effect relation.
Regression analysis is the primary technique for solving regression problems
in machine learning using data modelling. It involves determining the best-fit
line, that is, a line drawn through the data points in such a way that the
distance of the line from each data point is minimized.
Types of Regression Analysis Techniques
There are many types of regression analysis techniques, and the use of each
method depends upon the number of factors. These factors include the type of
target variable, shape of the regression line, and the number of independent
variables.
Below are the different regression techniques:
1. Linear Regression
2. Logistic Regression
3. Ridge Regression
4. Lasso Regression
5. Polynomial Regression
6. Bayesian Linear Regression
The different types of regression in machine learning techniques are explained
below in detail:
1. Linear Regression
Linear regression is one of the most basic types of regression in machine
learning. The linear regression model consists of a predictor variable and a
dependent variable related linearly to each other. In case the data involves
more than one independent variable, the model is called multiple
linear regression.
The below-given equation is used to denote the linear regression model:
y=mx+c+e
where m is the slope of the line, c is an intercept, and e represents the error in
the model.
The best-fit line is determined by varying the values of m and c. The predictor
error is the difference between the observed values and the predicted values.
The values of m and c are selected in such a way that they give the minimum
predictor error. It is important to note that a simple linear regression model is
susceptible to outliers; therefore, it should not be applied to large datasets
without first handling outliers.
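The way m and c are selected can be sketched with the closed-form least-squares formulas: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. The small dataset below is made up purely for illustration.

```python
# Minimal sketch of simple linear regression: fit y = m*x + c
# by ordinary least squares using the closed-form formulas.

def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope m = covariance(x, y) / variance(x)
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    c = mean_y - m * mean_x   # intercept from the means
    return m, c

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # roughly y = 2x, with small noise
m, c = fit_line(xs, ys)
print(round(m, 2), round(c, 2))   # close to slope 2 and intercept 0
```

Minimizing the predictor error this way gives the unique line with the smallest sum of squared vertical distances to the data points.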
2. Logistic Regression
Logistic regression is one of the types of regression analysis techniques, which
gets used when the dependent variable is discrete. Example: 0 or 1, true or
false, etc. This means the target variable can have only two values, and a
sigmoid curve denotes the relation between the target variable and the
independent variable.
Logit function is used in Logistic Regression to measure the relationship
between the target variable and independent variables. Below is the equation
that denotes the logistic regression.
logit(p) = ln(p/(1-p)) = b0 + b1X1 + b2X2 + b3X3 + … + bkXk
where p is the probability of occurrence of the feature.

For selecting logistic regression as the regression analysis technique, it should
be noted that the size of the data should be large, with an almost equal
occurrence of the two values of the target variable. Also, there should be no
multicollinearity, which means that there should be no correlation between the
independent variables in the dataset.
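The logit link and its inverse, the sigmoid, can be sketched directly from the equation above. The coefficients b0 and b1 below are made-up illustration values, not fitted from any dataset.

```python
import math

def sigmoid(z):
    # inverse of the logit: maps a linear score to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    # log-odds: ln(p / (1 - p))
    return math.log(p / (1.0 - p))

b0, b1 = -1.0, 2.0          # hypothetical intercept and coefficient
x = 0.5
p = sigmoid(b0 + b1 * x)    # predicted probability for input x
print(round(p, 3))          # here b0 + b1*x = 0, so p = 0.5

# applying the logit to p recovers the linear score, as in the equation above
assert abs(logit(p) - (b0 + b1 * x)) < 1e-12
```

The sigmoid curve mentioned in the text is exactly the graph of this function: it squashes any linear combination of the inputs into a probability between 0 and 1.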
3. Ridge Regression
This is another one of the types of regression in machine learning which is
usually used when there is a high correlation between the independent
variables. This is because, in the case of multicollinear data, the least-squares
estimates are unbiased, but their variances are large. Therefore, a bias term is
introduced into the equation of Ridge Regression, which trades a small amount
of bias for a large reduction in variance. This makes the model less susceptible
to overfitting.

Below is the equation used to denote Ridge Regression, where the
introduction of λ (lambda) mitigates the problem of multicollinearity:
β = (X^T X + λI)^{-1} X^T y
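In the one-feature, no-intercept case the matrix equation above reduces to a scalar formula, which makes the shrinking effect of λ easy to see. The data and λ values below are made up for illustration.

```python
# Scalar sketch of ridge regression: with a single feature and no
# intercept, β = (XᵀX + λI)⁻¹ Xᵀy reduces to xty / (xtx + λ).

def ridge_1d(xs, ys, lam):
    xtx = sum(x * x for x in xs)             # XᵀX as a scalar
    xty = sum(x * y for x, y in zip(xs, ys)) # Xᵀy as a scalar
    return xty / (xtx + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]             # exactly y = 2x
print(ridge_1d(xs, ys, 0.0))     # λ = 0 recovers ordinary least squares: 2.0
print(ridge_1d(xs, ys, 1.0))     # λ > 0 shrinks the coefficient toward 0
```

Increasing λ increases the denominator, pulling the coefficient toward zero; this is the bias that stabilizes the estimate when the predictors are collinear.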
4. Lasso Regression
Lasso Regression is one of the types of regression in machine learning that
performs regularization along with feature selection. It penalizes the absolute
size of the regression coefficients. As a result, coefficient values can shrink
all the way to zero, which does not happen in the case of Ridge Regression.
Due to this, feature selection is built into Lasso Regression, which allows
selecting a subset of features from the dataset to build the model. In the case of
Lasso Regression, only the required features are used, and the coefficients of
the other ones are made zero. This helps in avoiding overfitting in the model.
If the independent variables are highly collinear, then Lasso Regression picks
only one variable and shrinks the other variables to zero.
Below is the equation that represents the Lasso Regression method:
N^{-1} Σ_{i=1}^{N} f(x_i, y_i, α, β)
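The mechanism by which lasso drives coefficients exactly to zero can be sketched with the soft-thresholding operator that the L1 penalty applies to each coefficient during coordinate descent. The threshold value lam below is an arbitrary illustration.

```python
# Soft-thresholding: the per-coefficient update induced by the L1 penalty.

def soft_threshold(beta, lam):
    # shrink |beta| by lam; coefficients smaller than lam become exactly 0
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0

print(soft_threshold(3.0, 1.0))   # large coefficient: shrunk to 2.0
print(soft_threshold(0.4, 1.0))   # small coefficient: set exactly to 0.0
```

This is the key contrast with ridge: ridge rescales every coefficient toward zero, while the lasso threshold snaps small coefficients exactly to zero, discarding those features.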
5. Polynomial Regression
Polynomial Regression is another one of the types of regression
analysis techniques in machine learning, which is the same as Multiple Linear
Regression with a little modification. In Polynomial Regression, the
relationship between the independent variable X and the dependent variable Y
is modelled by an n-th degree polynomial in X.
It is still a linear model as an estimator, because it is linear in the
coefficients. The Least Mean Squared method is used in Polynomial
Regression as well. The best-fit line in Polynomial Regression is not a straight
line but a curved line, whose shape depends on the power of X, that is, the
value of n.

While trying to reduce the Mean Squared Error to a minimum and get the best
fit, the model can be prone to overfitting. It is recommended to analyze the
curve towards its ends, as higher-degree polynomials can give strange results
on extrapolation.
The below equation represents the Polynomial Regression:
y = β0 + β1x + β2x^2 + … + βnx^n + ε
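The "little modification" over multiple linear regression is a feature expansion: each input x is mapped to the vector [1, x, x², …, xⁿ], after which an ordinary linear model is fitted on the new features. The degree and the coefficient values below are arbitrary illustrations.

```python
# Sketch of the polynomial feature expansion behind polynomial regression.

def poly_features(x, degree):
    # map a single input x to [x^0, x^1, ..., x^degree]
    return [x ** d for d in range(degree + 1)]

print(poly_features(2.0, 3))       # [1.0, 2.0, 4.0, 8.0]

# prediction with hypothetical coefficients β0, β1, β2
beta = [1.0, 0.5, -0.2]
x = 2.0
y_hat = sum(b * f for b, f in zip(beta, poly_features(x, 2)))
print(y_hat)                       # 1.0 + 0.5*2 - 0.2*4
```

Because the prediction is a linear combination of the expanded features, the coefficients can still be found by the same least-squares machinery as in linear regression.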

6. Bayesian Linear Regression
Bayesian Regression is one of the types of regression in machine learning that
uses the Bayes theorem to find out the value of regression coefficients. In this
method of regression, the posterior distribution of the features is determined
instead of finding the least-squares. Bayesian Linear Regression is like both
Linear Regression and Ridge Regression but is more stable than the simple
Linear Regression.

Introduction to Clustering

Clustering is basically a type of unsupervised learning method. An
unsupervised learning method is one in which we draw inferences from datasets
consisting of input data without labelled responses. Generally, it is used as a
process to find meaningful structure, explanatory underlying processes,
generative features, and groupings inherent in a set of examples.
Clustering is the task of dividing the population or data points into a number of
groups such that data points in the same group are more similar to each other
than to data points in other groups. It is basically a grouping of objects on the
basis of the similarity and dissimilarity between them.
For example, the data points in the graph below that are clustered together can
be classified into one single group. We can distinguish the clusters, and we can
identify that there are 3 clusters in the picture below.
Clusters need not be spherical, for example:

Clustering Methods :
 Density-Based Methods : These methods consider the clusters as dense
regions having some similarity, which differ from the lower-density regions
of the space. These methods have good accuracy and the ability to merge
two clusters. Examples: DBSCAN (Density-Based Spatial Clustering of
Applications with Noise), OPTICS (Ordering Points To Identify the
Clustering Structure), etc.

 Hierarchical Based Methods : The clusters formed in this method
form a tree-type structure based on the hierarchy. New clusters are formed
using the previously formed ones. It is divided into two categories:
 Agglomerative (bottom-up approach)
 Divisive (top-down approach)
Examples: CURE (Clustering Using Representatives), BIRCH (Balanced
Iterative Reducing and Clustering using Hierarchies), etc.

 Partitioning Methods : These methods partition the objects into k
clusters, and each partition forms one cluster. They are used to optimize an
objective similarity criterion, for example when distance is the major
parameter. Examples: K-means, CLARANS (Clustering Large Applications
based upon RANdomized Search), etc.

 Grid-based Methods : In these methods the data space is divided into a
finite number of cells that form a grid-like structure. All the clustering
operations done on these grids are fast and independent of the number of
data objects. Examples: STING (Statistical Information Grid), WaveCluster,
CLIQUE (CLustering In QUEst), etc.
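The partitioning approach can be sketched with a minimal K-means on one-dimensional points: alternate between assigning each point to its nearest center and moving each center to the mean of its cluster. The data and the initial centers below are made up for illustration.

```python
# Minimal sketch of K-means (a partitioning method) on 1-D points.

def kmeans_1d(points, centers, iters=10):
    clusters = [[] for _ in centers]
    for _ in range(iters):
        # assignment step: attach each point to its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[i].append(p)
        # update step: move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]   # two obvious groups
centers, clusters = kmeans_1d(points, [0.0, 10.0])
print(centers)                             # converges to [1.0, 8.0]
```

Real implementations add a convergence check and multiple random restarts, but the two alternating steps shown here are the whole idea of the partitioning method.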

Applications of Clustering in different fields

 Marketing : It can be used to characterize & discover customer segments for
marketing purposes.
 Biology : It can be used for classification among different species of plants
and animals.
 Libraries : It is used in clustering different books on the basis of topics and
information.
 Insurance : It is used to understand the customers and their policies, and to
identify frauds.

Problem:
The eight puzzle problem is also known as the N-puzzle problem or the sliding puzzle problem.
An N-puzzle consists of N tiles (N+1 cells, including one empty cell), where N can be 8, 15, 24
and so on.
In our example N = 8 (that is, square root of (8+1) = 3 rows and 3 columns).

In the same way, if we have N = 15 or 24, then the board has square root of (N+1) rows and
square root of (N+1) columns.
That is, if N = 15 the number of rows and columns is 4, and if N = 24 the number of rows and
columns is 5.
So, basically, in these types of problems we are given an initial state or initial configuration (start
state) and a goal state or goal configuration.
Here we are solving the 8-puzzle problem, that is, a 3x3 matrix.

INITIAL STATE        GOAL STATE

Solution:
The puzzle can be solved by moving the tiles one by one into the single empty space, thus
achieving the goal state.
Rules of solving the puzzle
Instead of moving the tiles into the empty space, we can visualize moving the empty space in
place of the tile.
The empty space can only move in four directions (Movement of empty space)
1. Up
2. Down
3. Right or
4. Left
The empty space cannot move diagonally and can take only one step at a time.

All possible moves of an empty tile:

o-position: total possible moves are 2; x-position: total possible moves are 3;
#-position: total possible moves are 4.
Let's first solve the problem without heuristic search, that is, with uninformed (blind) search
(Breadth First Search and Depth First Search).

Breadth First Search to solve the eight puzzle problem

Note: If we solve this problem with depth first search, it will go deep into the tree instead of
exploring the nodes layer by layer.
Time complexity: In the worst case, the time complexity of BFS is O(b^d), read as "order of b
raised to the power d". In this particular case it is 3^20.
b - branching factor
d - depth factor
Let's now solve the problem with heuristic search, that is, informed search (A*, Best First Search
(Greedy Search)).
To solve the problem with heuristic (informed) search we have to calculate the heuristic value of
each node in order to calculate the cost function (f = g + h).

INITIAL STATE GOAL STATE

Note: Look at the initial state and the goal state carefully: all values except 4, 5 and 8 are in their
respective places, so the heuristic value for the first node is 3 (three values are misplaced with
respect to the goal). And let's take the actual cost (g) according to the depth.
Note: Because a solution is found quickly, the time complexity is lower than that of uninformed
search, but an optimal solution is not guaranteed.
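The misplaced-tiles heuristic used above can be sketched as a simple count. Since the notes' figure is not reproduced here, the start state below is a hypothetical board (with 0 marking the empty cell) chosen so that exactly the tiles 4, 5 and 8 are out of place, matching the example's h = 3.

```python
# Misplaced-tiles heuristic for the 8-puzzle; boards are 9-element lists
# read row by row, with 0 marking the empty cell.

def misplaced_tiles(state, goal):
    # count tiles (not the empty cell) that are out of place
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

goal  = [1, 2, 3, 4, 5, 6, 7, 8, 0]
start = [1, 2, 3, 5, 4, 6, 7, 0, 8]   # hypothetical: 4, 5 and 8 misplaced
print(misplaced_tiles(start, goal))   # h = 3, as in the worked example
```

With g taken as the depth of the node, f = g + h then guides A* or greedy best-first search toward boards with fewer misplaced tiles.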
