AI Unit 2 Notes
SUBJECT NOTES
UNIT-2: Data & Algorithms, History of Data, Data Storage and Importance of Data and its Acquisition, The Stages of Data Processing, Data Visualization, Regression, Prediction & Classification, Clustering & Recommender Systems.
A History of Data
Data is part of the fabric of life and society — and has been for a long time. The history
of data is a long story detailing the evolution of data collection, storage and processing.
It's said that knowledge is power.
1. Personal data
Personal data is anything that is specific to you. It covers your demographics, your
location, your email address and other identifying factors.
2. Transactional data
Transactional data is anything that requires an action to collect. You might click
on an ad, make a purchase, visit a certain web page, etc.
3. Web data
Web data is a collective term which refers to any type of data you might pull from
the internet, whether to study for research purposes or otherwise. That might be
data on what your competitors are selling, published government data, football
scores, etc.
4. Sensor data
Sensor data is produced by physical objects and devices, often referred to collectively as the Internet of Things (IoT). It covers everything from your smartwatch measuring your heart rate to a building with external sensors that measure the weather.
Data acquisition systems that operate on analog signals are known as analog data acquisition systems. The following are the blocks of an analog data acquisition system.
Transducer − It converts physical quantities into electrical signals.
Signal conditioner − It performs the functions like amplification and selection of
desired portion of the signal.
Display device − It displays the input signals for monitoring purpose.
Graphic recording instruments − These can be used to make the record of input
data permanently.
Magnetic tape instrumentation − It is used for acquiring, storing and reproducing input data.
Data acquisition systems that operate on digital signals are known as digital data acquisition systems, so they use digital components for storing or displaying the information. Mainly, a digital data acquisition system performs operations such as handling of the input signal, conversion of the analog signal into digital form, digital processing, and storage or display of the data.
Types of Artificial Intelligence Algorithms.
o 1. Classification Algorithms.
a) Naive Bayes.
b) Decision Tree.
c) Random Forest.
d) Support Vector Machines.
e) K Nearest Neighbours.
o 2. Regression Algorithms.
a) Linear regression.
b) Lasso Regression.
c) Logistic Regression.
d) Multivariate Regression.
e) Multiple Regression Algorithm.
o 3. Clustering Algorithms.
a) K-Means Clustering.
b) Fuzzy C-means Algorithm.
c) Expectation-Maximisation (EM) Algorithm.
d) Hierarchical Clustering Algorithm.
AI research is highly technical and specialized, deeply divided into subfields that often fail to communicate with each other.[5] Some of the division is due to social and cultural factors: subfields have grown up around particular institutions and the work of individual researchers. AI research is also divided by several technical issues. There are subfields which are focused on the solution of specific problems, on one of several possible approaches, on the use of widely differing tools, and on the accomplishment of particular applications. The central problems of AI include such traits as reasoning, knowledge, planning, learning, communication, perception and the ability to move and manipulate objects.[6] General intelligence (or "strong AI") is still among the field's long-term goals.[7] Currently popular approaches include statistical methods, computational intelligence and traditional symbolic AI.
In computer science and operations research, a genetic algorithm is a metaheuristic inspired by the
process of natural selection that belongs to the larger class of evolutionary algorithms.
The process of natural selection starts with the selection of the fittest individuals from a population. They produce offspring which inherit the characteristics of the parents and are added to the next generation. If the parents have better fitness, their offspring are likely to be better than the parents and have a better chance of surviving. This process keeps iterating, and at the end a generation with the fittest individuals will be found.
This notion can be applied to a search problem: we consider a set of candidate solutions to a problem and select the best ones out of them. Five phases are considered in a genetic algorithm:
1. Initial population
2. Fitness function
3. Selection
4. Crossover
5. Mutation
Initial Population
The process begins with a set of individuals which is called a Population. Each
individual is a solution to the problem you want to solve.
Fitness Function
The fitness function determines how fit an individual is (the ability of an individual to compete with other individuals). It gives a fitness score to each individual; the probability that an individual will be selected for reproduction is based on its fitness score.
Selection
The idea of the selection phase is to select the fittest individuals and let them pass their genes on to the next generation.
Two pairs of individuals (parents) are selected based on their fitness scores.
Individuals with high fitness have more chance to be selected for reproduction.
Crossover
Crossover is the most significant phase in a genetic algorithm. For each pair of parents to be mated, a crossover point is chosen at random from within the genes. For example, consider the crossover point to be 3.
Offspring are created by exchanging the genes of the parents among themselves until the crossover point is reached, and the new offspring are added to the population.
Mutation
In certain new offspring formed, some of their genes can be subjected to mutation with a low random probability. Mutation maintains diversity within the population and prevents premature convergence.
Termination
The algorithm terminates if the population has converged (does not produce offspring which are significantly different from the previous generation). Then it is said that the genetic algorithm has provided a set of solutions to our problem.
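As a concrete illustration, the five phases above can be sketched in Python on the classic OneMax problem (maximize the number of 1-bits in a bit string). The problem choice, population size and rates below are illustrative assumptions, not part of the original notes:

```python
import random

def one_max_fitness(individual):
    # Fitness function: number of 1-bits; the optimum is a string of all 1s.
    return sum(individual)

def genetic_algorithm(n_bits=20, pop_size=30, generations=100, mutation_rate=0.01):
    # Initial population: random bit strings.
    population = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: parents chosen with probability proportional to fitness.
        weights = [one_max_fitness(ind) + 1 for ind in population]  # +1 avoids zero weights
        parents = random.choices(population, weights=weights, k=pop_size)
        next_gen = []
        for i in range(0, pop_size, 2):
            p1, p2 = parents[i], parents[i + 1]
            # Crossover: single crossover point chosen at random within the genes.
            point = random.randint(1, n_bits - 1)
            c1 = p1[:point] + p2[point:]
            c2 = p2[:point] + p1[point:]
            # Mutation: flip each gene with a small probability.
            for child in (c1, c2):
                for j in range(n_bits):
                    if random.random() < mutation_rate:
                        child[j] = 1 - child[j]
            next_gen.extend([c1, c2])
        population = next_gen
    # Termination here is simply a fixed generation budget.
    return max(population, key=one_max_fitness)

random.seed(42)
best = genetic_algorithm()
```

After 100 generations the fittest individual is typically at or near the all-ones optimum.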
The algorithms we shall discuss in this tutorial are the decision-based algorithms as well as the simplest of the recursive-type algorithms. This is because they are easy to comprehend as well as easy to implement. We will ignore BFS, Dijkstra and A* here, since these are very complex algorithms which may require special data structures and are harder to implement. Path-finding consumes a significant amount of resources, especially in movement-intensive games such as (massively) multiplayer games.
We investigate several path-finding techniques, and explore the impact on performance of workloads derived from real player movements in a multiplayer game. We find that a map-conforming, hierarchical path-finding strategy performs best, and in combination with caching optimizations can greatly reduce path-finding cost. Performance is dominated primarily by the algorithm, and only to a lesser degree by workload variation. Understanding the real impact of path-finding techniques allows for refined testing and optimization of game design.
Heuristic Function
A heuristic is a technique that improves the efficiency of a search process, possibly by sacrificing claims of completeness. While an almost perfect heuristic is significant for theoretical analysis, it is not common to find such a heuristic in practice. Heuristics play a major role in search strategies because of the exponential nature of most problems.
Heuristics help to reduce the number of alternatives from an exponential number to a polynomial number.
Heuristic search has been widely used in both deterministic and probabilistic planning. Yet, although bidirectional heuristic search has been applied broadly in deterministic planning problems, its function in probabilistic domains is only sparsely studied.
A heuristic function is a function that maps from problem state descriptions to measures of desirability, usually represented as a number. Heuristic functions generally have different errors in different states. Heuristic functions play a crucial role in optimal planning, and the theoretical limitations of algorithms using such functions are therefore of interest. Much work has focused on finding bounds on the behavior of heuristic search algorithms using heuristics with specific attributes.
“A heuristic function, also called simply a heuristic, is a function that ranks alternatives in
search algorithms at each branching step based on available information to decide which
branch to follow. For example, it may approximate the exact solution.”
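For example, a common heuristic function for grid path-finding is the Manhattan distance. A minimal sketch (the function name is an illustrative choice):

```python
def manhattan_heuristic(node, goal):
    # Admissible heuristic for 4-connected grids: it never overestimates
    # the true remaining cost, since each move changes x or y by exactly 1.
    (x1, y1), (x2, y2) = node, goal
    return abs(x1 - x2) + abs(y1 - y2)
```

Because it never overestimates, search algorithms such as A* remain optimal when guided by it.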
Breadth First Search (BFS) Algorithm
BFS stands for Breadth First Search, a vertex-based technique for finding a shortest path in a graph. It uses a Queue data structure, which follows first in, first out. In BFS, one vertex is selected at a time: it is visited and marked, then its adjacent vertices are visited and stored in the queue. It is slower than DFS.
Breadth first search is a graph traversal algorithm that starts traversing the graph from the root node and explores all the neighbouring nodes. Then, it selects the nearest node and explores all the unexplored nodes. The algorithm follows the same process for each of the nearest nodes until it finds the goal.
The algorithm of breadth first search is given below. The algorithm starts by examining the node A and all of its neighbours. In the next step, the neighbours of the nearest node of A are explored and the process continues in the further steps. The algorithm explores all neighbours of all the nodes and ensures that each node is visited exactly once and no node is visited twice.
Algorithm
o Step 1: SET STATUS = 1 (ready state) for each node in G
o Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
o Step 3: Repeat Steps 4 and 5 until QUEUE is empty
o Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
o Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state)
[END OF LOOP]
o Step 6: EXIT
Example
Consider the graph G shown in the following image and calculate the minimum path P from node A to node E, given that each edge has a length of 1.
Solution:
The minimum path P can be found by applying the breadth first search algorithm, which will begin at node A and end at E. The algorithm uses two queues, namely QUEUE1 and QUEUE2. QUEUE1 holds all the nodes that are to be processed, while QUEUE2 holds all the nodes that have been processed and deleted from QUEUE1.
1. QUEUE1 = {A}
2. QUEUE2 = {NULL}
2. Delete the Node A from QUEUE1 and insert all its neighbours. Insert Node A
into QUEUE2
1. QUEUE1 = {B, D}
2. QUEUE2 = {A}
3. Delete the node B from QUEUE1 and insert all its neighbours. Insert node B into
QUEUE2.
1. QUEUE1 = {D, C, F}
2. QUEUE2 = {A, B}
4. Delete the node D from QUEUE1 and insert all its neighbours. Since its only neighbour F has already been inserted, we will not insert it again. Insert node D into QUEUE2.
1. QUEUE1 = {C, F}
2. QUEUE2 = { A, B, D}
5. Delete the node C from QUEUE1 and insert all its neighbours. Add node C to
QUEUE2.
1. QUEUE1 = {F, E, G}
2. QUEUE2 = {A, B, D, C}
6. Remove F from QUEUE1 and add all its neighbours. Since all of its neighbours have already been added, we will not add them again. Add node F to QUEUE2.
1. QUEUE1 = {E, G}
2. QUEUE2 = {A, B, D, C, F}
7. Remove E from QUEUE1. All of E's neighbours have already been added to QUEUE1, so we will not add them again. All the nodes are visited, and the target node E has been encountered and placed into QUEUE2.
1. QUEUE1 = {G}
2. QUEUE2 = {A, B, D, C, F, E}
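The worked example can be reproduced with a short Python sketch. The adjacency list below is reconstructed from the queue trace above (the original figure is not reproduced here), so treat it as an assumption:

```python
from collections import deque

def bfs_shortest_path(graph, start, goal):
    # Standard BFS with a FIFO queue; each node is enqueued at most once,
    # so the first path that reaches the goal is a shortest path.
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # goal unreachable

# Adjacency list inferred from the QUEUE1/QUEUE2 trace above (an assumption).
graph = {
    'A': ['B', 'D'], 'B': ['A', 'C', 'F'], 'C': ['B', 'E', 'G'],
    'D': ['A', 'F'], 'E': ['C'], 'F': ['B', 'D'], 'G': ['C'],
}
```

On this graph `bfs_shortest_path(graph, 'A', 'E')` returns the three-edge path A → B → C → E.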
Depth First Search (DFS) Algorithm
The data structure used in DFS is a stack. The process is similar to the BFS algorithm. In DFS, the edges that lead to an unvisited node are called discovery edges, while the edges that lead to an already visited node are called block edges.
Algorithm
o Step 1: SET STATUS = 1 (ready state) for each node in G
o Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting
state)
o Step 3: Repeat Steps 4 and 5 until STACK is empty
o Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed
state)
o Step 5: Push on the stack all the neighbours of N that are in the ready state
(whose STATUS = 1) and set their
STATUS = 2 (waiting state)
[END OF LOOP]
o Step 6: EXIT
Example :
Consider the graph G along with its adjacency list, given in the figure below.
Calculate the order to print all the nodes of the graph starting from node H, by using
depth first search (DFS) algorithm.
Solution :
Push H onto the stack
1. STACK : H
Pop the top element of the stack, i.e. H, print it and push all the neighbours of H onto the stack that are in the ready state.
1. Print H
2. STACK : A
Pop the top element of the stack i.e. A, print it and push all the neighbours of A onto
the stack that are in ready state.
1. Print A
2. Stack : B, D
Pop the top element of the stack i.e. D, print it and push all the neighbours of D onto
the stack that are in ready state.
1. Print D
2. Stack : B, F
Pop the top element of the stack i.e. F, print it and push all the neighbours of F onto
the stack that are in ready state.
1. Print F
2. Stack : B
Pop the top of the stack i.e. B and push all the neighbours
1. Print B
2. Stack : C
Pop the top of the stack i.e. C and push all the neighbours.
1. Print C
2. Stack : E, G
Pop the top of the stack i.e. G and push all its neighbours.
1. Print G
2. Stack : E
Pop the top of the stack i.e. E and push all its neighbours.
1. Print E
2. Stack :
Hence, the stack now becomes empty and all the nodes of the graph have been
traversed.
1. H → A → D → F → B → C → G → E
Example:
A
/ \
B C
/ / \
D E F
Output is:
A, B, D, C, E, F
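A minimal recursive DFS sketch in Python, using the small tree from the example above:

```python
def dfs(graph, node, visited=None):
    # Recursive depth-first traversal: visit a node, then fully explore
    # each unvisited neighbour before moving on to the next one.
    if visited is None:
        visited = []
    visited.append(node)
    for neighbour in graph.get(node, []):
        if neighbour not in visited:
            dfs(graph, neighbour, visited)
    return visited

# The small tree from the example above, as an adjacency list.
tree = {'A': ['B', 'C'], 'B': ['D'], 'C': ['E', 'F']}
```

`dfs(tree, 'A')` visits the nodes in the order A, B, D, C, E, F, matching the output above.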
Difference between BFS and DFS:
BFS uses a queue (first in, first out) and explores the graph level by level, which makes it suitable for finding shortest paths in unweighted graphs. DFS uses a stack (last in, first out, or recursion) and explores as deep as possible along each branch before backtracking, and it generally needs less memory on wide graphs.
A* search algorithm
A* is a graph traversal and path search algorithm, which is often used in many fields of computer
science due to its completeness, optimality, and optimal efficiency. One major practical drawback is
its O(b^d) space complexity, as it stores all generated nodes in memory.
Explanation
Consider a square grid having many obstacles and we are given a starting cell
and a target cell. We want to reach the target cell (if possible) from the starting
cell as quickly as possible. Here A* Search Algorithm comes to the rescue.
What the A* Search Algorithm does is that at each step it picks the node according to a value 'f', which is a parameter equal to the sum of two other parameters, 'g' and 'h'. At each step it picks the node/cell having the lowest 'f' and processes that node/cell.
We define ‘g’ and ‘h’ as simply as possible below
g = the movement cost to move from the starting point to a given square on the
grid, following the path generated to get there.
h = the estimated movement cost to move from that given square on the grid to
the final destination. This is often referred to as the heuristic, which is nothing
but a kind of smart guess. We really don’t know the actual distance until we find
the path, because all sorts of things can be in the way (walls, water, etc.). There
can be many ways to calculate this ‘h’ which are discussed in the later sections.
Algorithm
We create two lists, an Open List and a Closed List (just like in Dijkstra's algorithm).
// A* Search Algorithm
1. Initialize the open list
2. Initialize the closed list
3. Put the starting node on the open list (you can leave its f at zero)
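Since the pseudocode above is abbreviated, here is a minimal, self-contained Python sketch of A* on a 4-connected grid with the Manhattan distance as h. The grid layout and names are illustrative assumptions, not from the notes:

```python
import heapq

def a_star(grid, start, goal):
    # grid: 2D list where 0 = free cell and 1 = obstacle.
    # f(n) = g(n) + h(n); h is the Manhattan distance (admissible on 4-connected grids).
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_list = [(h(start), 0, start, [start])]  # entries: (f, g, cell, path)
    closed = set()                                # the closed list
    while open_list:
        f, g, cell, path = heapq.heappop(open_list)  # lowest f first
        if cell == goal:
            return path
        if cell in closed:
            continue
        closed.add(cell)
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in closed:
                heapq.heappush(open_list, (g + 1 + h((nr, nc)), g + 1, (nr, nc), path + [(nr, nc)]))
    return None  # no path exists

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = a_star(grid, (0, 0), (2, 0))  # must route around the obstacle row
```

On this grid the only route from the top-left to the bottom-left corner goes around the right side, a path of six moves.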
When designing and evaluating a new algorithm, one of the first steps is exploratory data analysis (EDA). The point is to find the most efficient learning approach for a given problem. For the human researcher to understand what's working and what's not, the model's results are often displayed graphically. Since these datasets cover many variables and are so "high-dimensional", several new data visualization techniques have been developed specifically for deep learning systems, such as dimensionality-reduction plots (for example PCA or t-SNE projections) for interpreting high-dimensional relationships.
What is Regression?
Introduction
Linear regression and logistic regression are two types of regression
analysis techniques that are used to solve the regression problem
using machine learning. They are the most prominent techniques of
regression. But, there are many types of regression analysis techniques in
machine learning, and their usage varies according to the nature of the data
involved.
This article will explain the different types of regression in machine learning,
and under what condition each of them can be used. If you are new to machine
learning, this article will surely help you in understanding the regression
modelling concept.
What is Regression Analysis?
Regression analysis is a predictive modelling technique that analyzes the
relation between the target or dependent variable and independent variable in
a dataset. The different types of regression analysis techniques get used when
the target and independent variables show a linear or non-linear relationship
between each other, and the target variable contains continuous values. The
regression technique gets used mainly to determine the predictor strength,
forecast trend, time series, and in case of cause & effect relation.
Regression analysis is the primary technique to solve regression problems in machine learning using data modelling. It involves determining the best-fit line, which is a line drawn through the data points in such a way that the total distance of the data points from the line is minimized.
Types of Regression Analysis Techniques
There are many types of regression analysis techniques, and the use of each
method depends upon the number of factors. These factors include the type of
target variable, shape of the regression line, and the number of independent
variables.
Below are the different regression techniques:
1. Linear Regression
2. Logistic Regression
3. Ridge Regression
4. Lasso Regression
5. Polynomial Regression
6. Bayesian Linear Regression
The different types of regression in machine learning techniques are explained
below in detail:
1. Linear Regression
Linear regression is one of the most basic types of regression in machine learning. The linear regression model consists of a predictor variable and a dependent variable related linearly to each other. If the data involves more than one independent variable, the model is called a multiple linear regression model.
The below-given equation is used to denote the linear regression model:
y = mx + c + e
where m is the slope of the line, c is the intercept, and e represents the error in the model.
The best-fit line is determined by varying the values of m and c. The predictor error is the difference between the observed values and the predicted values. The values of m and c are selected so as to give the minimum predictor error. It is important to note that a simple linear regression model is susceptible to outliers; therefore, it should not be used in the case of big-sized data.
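A minimal sketch of fitting y = mx + c by least squares, using NumPy on noiseless toy data (the data values are illustrative):

```python
import numpy as np

# Toy data generated from the true model m = 2, c = 1, with e = 0,
# so least squares should recover the slope and intercept exactly.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

# np.polyfit with deg=1 minimizes the sum of squared prediction errors.
m, c = np.polyfit(x, y, deg=1)
```

With noisy data the recovered m and c would only approximate the true values, which is where the predictor error discussed above comes from.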
2. Logistic Regression
Logistic regression is one of the types of regression analysis techniques, used when the dependent variable is discrete, for example 0 or 1, or true or false. This means the target variable can have only two values, and a sigmoid curve denotes the relation between the target variable and the independent variable.
Logit function is used in Logistic Regression to measure the relationship
between the target variable and independent variables. Below is the equation
that denotes the logistic regression.
logit(p) = ln(p/(1-p)) = b0+b1X1+b2X2+b3X3….+bkXk
where p is the probability of occurrence of the feature.
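The logit and its inverse (the sigmoid) can be sketched directly from the equation above:

```python
import math

def logit(p):
    # Log-odds of a probability p in (0, 1): ln(p / (1 - p)).
    return math.log(p / (1 - p))

def sigmoid(z):
    # Inverse of the logit: maps any real z back to a probability in (0, 1).
    return 1 / (1 + math.exp(-z))
```

In logistic regression, the linear combination b0 + b1X1 + … + bkXk plays the role of z, and the sigmoid converts it into a probability.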
3. Ridge Regression
Ridge regression is used when the independent variables are highly correlated (multicollinearity). Below is the equation used to denote Ridge Regression, where the introduction of λ (lambda) solves the problem of multicollinearity:
β = (X^{T}X + λ*I)^{-1}X^{T}y
4. Lasso Regression
Lasso Regression is one of the types of regression in machine learning that performs regularization along with feature selection. It penalizes the absolute size of the regression coefficients. As a result, coefficient values get nearer to zero, which does not happen in the case of Ridge Regression.
Due to this, feature selection gets used in Lasso Regression, which allows selecting a set of features from the dataset to build the model. In the case of Lasso Regression, only the required features are used, and the other coefficients are set to zero. This helps avoid overfitting in the model. If the independent variables are highly collinear, then Lasso regression picks only one of them and shrinks the others to zero.
Below is the equation that represents the Lasso Regression method:
N^{-1} Σ^{N}_{i=1} f(x_{i}, y_{i}, α, β)
5. Polynomial Regression
Polynomial Regression is another one of the types of regression analysis techniques in machine learning, which is the same as Multiple Linear Regression with a little modification. In Polynomial Regression, the relationship between the independent variable X and the dependent variable Y is modelled as an n-th degree polynomial in X.
The model is still linear in its coefficients, and the least squares method is used in Polynomial Regression as well. The best-fit line in Polynomial Regression that passes through the data points is not a straight line but a curve, whose shape depends upon the power of X, i.e. the value of n.
While trying to reduce the mean squared error to a minimum and get the best-fit line, the model can be prone to overfitting. It is recommended to analyze the curve towards the ends, as higher-degree polynomials can give strange results on extrapolation.
The below equation represents the Polynomial Regression:
y = β0 + β1x + β2x^2 + … + βnx^n + ε
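A minimal sketch of polynomial regression via least squares on noiseless quadratic toy data (the data values and degree are illustrative):

```python
import numpy as np

# Toy data from the true model y = 1 + 2x + 3x^2, with no noise,
# so least squares should recover the coefficients exactly.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = 1.0 + 2.0 * x + 3.0 * x ** 2

# np.polyfit returns coefficients from the highest degree down.
b2, b1, b0 = np.polyfit(x, y, deg=2)
```

Raising the degree n lets the curve bend more, but, as noted above, it also raises the risk of overfitting and of wild extrapolation beyond the data.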
Introduction to Clustering
Clustering Methods:
Density-Based Methods: These methods consider the clusters as dense regions having some similarity, distinct from the lower-density regions of the space. These methods have good accuracy and the ability to merge two clusters. Examples: DBSCAN (Density-Based Spatial Clustering of Applications with Noise), OPTICS (Ordering Points to Identify Clustering Structure), etc.
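Although the passage above focuses on density-based methods, K-Means (listed earlier among the clustering algorithms) is simple enough to sketch in pure Python. The toy points and parameters below are illustrative assumptions:

```python
import random

def k_means(points, k, iters=20, seed=0):
    # Minimal k-means: assign each point to its nearest centroid, then
    # move each centroid to the mean of the points assigned to it.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k random points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: (p[0] - centroids[i][0]) ** 2
                                      + (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if its cluster emptied
                centroids[i] = (sum(p[0] for p in cluster) / len(cluster),
                                sum(p[1] for p in cluster) / len(cluster))
    return centroids

# Two well-separated blobs; k-means should place one centroid near each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids = k_means(points, k=2)
```

Unlike DBSCAN, k-means needs the number of clusters k up front and assumes roughly spherical clusters, which is why density-based methods are preferred for irregular shapes.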
Problem:
The eight puzzle problem is also known as the N-puzzle problem or the sliding puzzle problem.
An N-puzzle consists of N tiles (N+1 positions, including an empty tile), where N can be 8, 15, 24 and so on. In our example N = 8, giving sqrt(8+1) = 3 rows and 3 columns.
In the same way, for N = 15 or 24 the board has sqrt(N+1) rows and sqrt(N+1) columns: if N = 15 the number of rows and columns is 4, and if N = 24 the number of rows and columns is 5.
So, basically in these types of problems we are given an initial state or initial configuration (start state) and a goal state or goal configuration.
Here we are solving an 8-puzzle problem, that is, a 3x3 matrix.
In the position diagram (figure not reproduced here): from an o-position (a corner) the total possible moves are 2, from an x-position (an edge) the total possible moves are 3, and from a #-position (the centre) the total possible moves are 4.
Let's first solve the problem without heuristic search, that is, with uninformed (blind) search: Breadth First Search and Depth First Search.
Note: Look at the initial state and goal state carefully: all values except 4, 5 and 8 are at their respective places. So, the heuristic value for the first node is 3 (three values are misplaced relative to the goal). And let's take the actual cost g according to depth.
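The misplaced-tiles heuristic described in the note can be sketched as follows. The start and goal boards below are hypothetical stand-ins for the figures (which are not reproduced here), chosen so that tiles 4, 5 and 8 are the misplaced ones:

```python
def misplaced_tiles(state, goal):
    # h(n) = number of tiles out of place; the blank (0) is not counted.
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

# Hypothetical 3x3 boards, flattened row by row; 0 marks the blank.
start = (1, 2, 3, 5, 4, 6, 7, 0, 8)
goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
```

Here `misplaced_tiles(start, goal)` is 3, matching the heuristic value in the note, and g can be taken as the depth of the node in the search tree.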
Note: Because it reaches a solution quickly, the time complexity is less than that of uninformed search, but an optimal solution is not guaranteed.