AI ML Notes 2


Artificial Intelligence and Machine Learning

Subject Code: 19ET7DEAIM

Dr. Preeti Khanwalkar

Department of Electronics and Telecommunication Engineering


Dayananda Sagar College of Engineering, Bangalore

When AI started to gain popularity, there was much debate among scientists and researchers about making a machine that would have the ability to learn, adapt, and make decisions.
The general idea was to have machines replicate the way the human brain learns.

AI is the attempt to build computational models of cognitive processes.

AI is about generating representations and procedures that allow machines to perform tasks that would be considered intelligent if performed by a human.

The term Artificial Intelligence was first coined by John McCarthy in 1956, who defined it as “the science and engineering of making intelligent machines, especially intelligent computer programs”.

Artificial Intelligence is a way of making a computer, a computer-controlled robot, or a piece of software think intelligently, in a manner similar to the way intelligent humans think.

The goals of AI are:
To create expert systems: systems that exhibit intelligent behavior, learn, demonstrate, explain, and advise their users.
To implement human intelligence in machines: creating systems that understand, think, learn, and behave like humans.
Machine Learning
Machine Learning: It is the science of getting computers to learn without being explicitly programmed.
Supervised, Unsupervised and Reinforcement
Supervised: In supervised learning we teach or train the machine using data that is well labelled, which means each example is already tagged with the correct answer. The supervised learning algorithm analyses the training data (the set of training examples). After that, the machine is given a new set of examples (data) and uses what it learned from the labeled data to produce the correct outcome for them.
Supervised learning is classified into two categories of algorithms:
Classification: A classification problem is when the output variable is a category, such as “Yes” or “No”, or “disease” or “no disease”.
Algorithms: Naive Bayes classifiers, K-NN (k-nearest neighbors), Decision Trees, Support Vector Machines, and Logistic Regression (which, despite its name, predicts a category)
Regression: A regression problem is when the output variable is a real value, such as house price prediction in dollars or weight prediction.
Algorithms: Linear Regression
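The difference between the two kinds of output can be sketched in a few lines of Python. This is only an illustration: the data points, the helper names `nearest_neighbour_label` and `fit_line`, and the choice of a 1-nearest-neighbour rule and a least-squares line are all assumptions made for the example, not part of the notes.

```python
# Toy data, invented for illustration only.
# Classification: a feature vector maps to a category label.
labelled_points = [
    ((1.0, 1.0), "no disease"),
    ((1.2, 0.8), "no disease"),
    ((3.0, 3.2), "disease"),
    ((3.1, 2.9), "disease"),
]


def nearest_neighbour_label(point, examples):
    """1-NN classifier: return the label of the closest training example."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    closest = min(examples, key=lambda example: squared_distance(example[0], point))
    return closest[1]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (simple linear regression)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Regression: an input maps to a real value (hypothetical areas and prices).
areas = [50.0, 100.0, 150.0]
prices = [100.0, 200.0, 300.0]
slope, intercept = fit_line(areas, prices)

predicted_label = nearest_neighbour_label((2.9, 3.0), labelled_points)  # a category
predicted_price = slope * 120.0 + intercept                             # a real number
```

The classifier returns one of the labels seen in training, while the regression returns a number that need not appear in the training data at all.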
Machine Learning Cont..

Unsupervised: Unsupervised learning is the training of a machine using information that is neither classified nor labeled, allowing the algorithm to act on that information without guidance.
Algorithms: Clustering, Principal Component Analysis
Reinforcement Learning: Reinforcement learning is a machine
learning training method based on rewarding desired behaviors
and punishing undesired ones. In general, a reinforcement
learning agent – the entity being trained – is able to perceive and
interpret its environment, take actions and learn through trial and
error.
Artificial Intelligence and Machine Learning cont..

One or multiple areas can contribute to building an intelligent system.


Artificial Intelligence and Machine Learning cont..

Programming Without AI vs. Programming With AI

Without AI: A computer program without AI can answer the specific questions it is meant to solve.
With AI: A computer program with AI can answer the generic questions it is meant to solve.

Without AI: Modification in the program leads to a change in its structure.
With AI: AI programs can absorb new modifications by putting highly independent pieces of information together. Hence we can modify even a minute piece of information in the program without affecting its structure.

Without AI: Modification is not quick and easy. It may affect the program adversely.
With AI: Program modification is quick and easy.
Applications of AI

Gaming: AI plays a crucial role in strategic games such as chess, poker, and tic-tac-toe, where the machine can think of a large number of possible positions based on heuristic knowledge.
Natural Language Processing: It is possible to interact with a computer that understands natural language spoken by humans.
Expert Systems: Some applications integrate machine, software, and special information to impart reasoning and advising. They provide explanations and advice to the users.
Vision Systems: These systems understand, interpret, and comprehend visual input on the computer. For example:
– A spy aeroplane takes photographs, which are used to figure out spatial information or a map of the area.
– Doctors use a clinical expert system to diagnose the patient.
– Police use computer software that can recognize the face of a criminal from the stored portrait made by a forensic artist.
Speech Recognition: Some intelligent systems are capable of hearing and comprehending language in terms of sentences and their meanings while a human talks to them. They can handle different accents, slang words, background noise, changes in a person’s voice due to a cold, etc.
Handwriting Recognition: Handwriting recognition software reads text written on paper with a pen or on a screen with a stylus. It can recognize the shapes of the letters and convert them into editable text.
Intelligent Robots: Robots are able to perform the tasks given by a human. They have sensors to detect physical data from the real world such as light, heat, temperature, movement, sound, bump, and pressure. They have efficient processors, multiple sensors, and huge memory to exhibit intelligence. In addition, they are capable of learning from their mistakes and can adapt to a new environment.
AI Techniques

Artificial Intelligence research during the last three decades has concluded that intelligence requires knowledge. To offset this one overwhelming asset, knowledge also possesses some less desirable properties:
It is huge.
It is difficult to characterize correctly.
It is constantly varying.
It differs from data by being organized in a way that corresponds to
its application.
It is complicated.

An AI technique is a method that exploits knowledge that is represented so that:
The knowledge captures generalizations: situations that share properties are grouped together rather than each being given a separate representation.
It can be understood by the people who must provide it. Even though for many programs the bulk of the data comes automatically from readings, in many AI domains people must supply the knowledge to the program in terms they understand.
It can be easily modified to correct errors and reflect changes in real conditions.
It can be widely used even if it is incomplete or inaccurate.
It can be used to help overcome its own sheer bulk by helping to narrow the range of possibilities that must usually be considered.

AI techniques describe how we represent, manipulate, and reason with knowledge in order to solve problems.
Knowledge is a collection of ’facts’.
To manipulate these facts with a program, a suitable representation is required.
A good representation facilitates problem solving.


Neuroscience and Neural Networks Analogy
The parts of a nerve cell (neuron) in the brain:
Each neuron consists of a cell body that contains the cell nucleus.
Branching out from the cell body are a number of fibers called dendrites and a single long fiber called the axon, which stretches out over a longer distance.
A neuron makes connections with 10 to 10,000 other neurons at junctions called synapses.
Signals are propagated from one neuron to another by a complicated electrochemical reaction.
The signals control the brain’s short-term and long-term activity based on the connections between neurons.
Neural Networks

Neurons transduce signals: from electrical to chemical, and from chemical back again to electrical.
Each synapse is associated with what we call the synaptic efficacy: the efficiency with which a signal is transmitted from the presynaptic to the postsynaptic neuron.
Neural Networks
The $j$th artificial neuron receives input signals $s_i$ from possibly $n$ different sources.
It has an internal activation $x_j$, which is a linear weighted aggregation of the impinging signals, modified by an internal threshold $\theta_j$.
The connection weights $w_{ij}$ model the synaptic efficacies; $w_{ij}$ denotes the weight from neuron $i$ to neuron $j$.
Neural Networks

An Artificial Neural Network is specified by:
a neuron model: the information processing unit of the NN,
an architecture: a set of neurons and links connecting the neurons, where each link has a weight,
a learning algorithm: used for training the NN by modifying the weights in order to model a particular learning task correctly on the training examples.
The aim is to obtain a NN that is trained and generalizes well.
It should behave correctly on new instances of the learning task.
Neural Networks
The ANN (Artificial Neural Network) is based on the BNN (Biological Neural Network), as its primary goal is to imitate the human brain and its functions.
Similar to the brain, which has neurons interlinked with each other, the ANN also has neurons, known as nodes, that are linked to each other across the various layers of the network.
Neural Networks

The ANN learns through various learning algorithms that are described as supervised or unsupervised learning.
In supervised learning algorithms, the target values are labeled. The goal is to reduce the error between the desired output (target) and the actual output.
In unsupervised learning algorithms, the target values are not labeled, and the network learns by itself by identifying patterns through repeated trials and experiments.
Characteristics of Neural Networks

It is a neurally implemented mathematical model.
A learning process is implemented to acquire knowledge.
It models systems with an unknown input-output relationship.
It contains a huge number of interconnected processing elements, called neurons, to do all operations.
A neural network simulates biological systems, where learning involves adjustments to the synaptic connections between neurons.
A neural network consists of a large number of neuron-like processing elements. All of these processing elements have a large number of weighted interconnections. In addition, the connections between the elements provide a distributed representation of data.
The input signals arrive at the processing elements through connections and connecting weights. Information is stored in the weighted linkages between neurons.
It has the ability to learn, recall, and generalize from the given data by suitable assignment and adjustment of weights.
The collective behavior of the neurons describes its computational power; no single neuron carries specific information.
It provides an adaptive response to changes in the surrounding environment.
Neural networks can be useful for pattern recognition or data classification, through a learning process.
Neural Networks
The neuron is the basic information processing unit of a NN. It consists of:
A set of links, describing the neuron inputs, with weights $W_1, W_2, \dots, W_m$.
An adder function (linear combiner) for computing the weighted sum of the inputs (real numbers):
$$u = \sum_{j=1}^{m} W_j X_j$$
An activation function $\Phi$ for limiting the amplitude of the neuron output. Here $b$ denotes the bias.
$$y = \Phi(u + b)$$
The bias $b$ has the effect of applying a transformation to the weighted sum $u$:
$$v = u + b$$
The bias is an external parameter of the neuron. It can be modeled by adding an extra input.
$v$ is called the induced field of the neuron:
$$v = \sum_{j=0}^{m} W_j X_j, \quad \text{where } X_0 = 1 \text{ and } W_0 = b$$
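As a small check of the two equivalent ways of writing the induced field, the following sketch computes the neuron output once with an explicit bias and once with the bias folded in as weight $W_0$ on an extra input $X_0 = 1$. The function names, the sample numbers, and the choice of the logistic function for $\Phi$ are assumptions made for illustration.

```python
import math


def weighted_sum(weights, inputs):
    """Adder (linear combiner): u = sum_j W_j * X_j."""
    return sum(w * x for w, x in zip(weights, inputs))


def logistic(v):
    """One possible activation function Phi (an assumption for this sketch)."""
    return 1.0 / (1.0 + math.exp(-v))


def neuron_output(weights, inputs, bias, activation):
    """y = Phi(u + b) with the bias kept as a separate parameter."""
    v = weighted_sum(weights, inputs) + bias
    return activation(v)


def neuron_output_bias_as_input(weights, inputs, bias, activation):
    """Same neuron, with the bias modeled as weight W_0 on an extra input X_0 = 1."""
    v = weighted_sum([bias] + list(weights), [1.0] + list(inputs))
    return activation(v)


W = [0.5, -1.0, 2.0]   # hypothetical weights
X = [1.0, 2.0, 0.5]    # hypothetical inputs
b = 0.25               # hypothetical bias

y_separate_bias = neuron_output(W, X, b, logistic)
y_extra_input = neuron_output_bias_as_input(W, X, b, logistic)
```

Both calls produce the same output, which is why the bias is often treated as just another weight during training.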
ANN: Terminologies
Weights: Each input connected to a neuron carries a weight. If there are ’n’ nodes, each having ’m’ weights, the weights are represented as a weight matrix.
Bias: Bias is a constant that is added to the sum of the products of inputs and weights to calculate the net input.
Bias is used to shift the result to the positive or negative side. The net input is increased by a positive bias and decreased by a negative bias.

Here, $\{1, x_1, \dots, x_n\}$ are the inputs, and the output $Y$ of the neuron is computed from the function $g(x)$, which sums up all the weighted inputs and adds the bias:
$$g(x) = \sum_{i=1}^{n} x_i w_i + b = x_1 w_1 + \dots + x_n w_n + b$$
The role of the activation function is to provide the output depending on the result of the summation function: $Y(g(x))$.

Threshold: A threshold value is a constant value that is compared to the net input to get the output. The activation function is defined based on the threshold value to calculate the output.
For example:
Y = 1 if net input ≥ threshold
Y = 0 otherwise
Learning Rate (α): The learning rate is used to control the amount of weight adjustment at each step of training. The learning rate ranges from 0 to 1. It determines the rate of learning at each time step.
Target value: Target values are the correct values of the output variable and are also known simply as targets.
Error: It is the inaccuracy of the predicted output values compared to the target values.
Neuron Models

The choice of activation function determines the neuron model.
[Figures: examples of neuron models for different activation functions]
Basic Learning Laws

The operation of a neural network is governed by neuronal dynamics.
Neuronal dynamics consists of two parts: one corresponding to the dynamics of the activation state and the other corresponding to the dynamics of the synaptic weights.
Short Term Memory (STM) in a neural network is modeled by the activation state of the network.
Long Term Memory (LTM) corresponds to the encoded pattern information in the synaptic weights due to learning.
Basic Learning Laws

Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded.
The type of learning is determined by the manner in which the parameter changes take place.
Learning is used to update the weights w.
Neural networks can be classified by the algorithm used for learning.
The two most used learning mechanisms are supervised and unsupervised learning.
Basic Learning Laws: Supervised learning

In supervised learning the weight changes are determined by the difference between the desired output and the actual output.
Some of the supervised learning laws are: error correction learning or delta rule, stochastic learning, and hardwired systems.
Supervised learning may be used for structural learning or for
temporal learning.
Structural learning is concerned with capturing in the weights the
relationship between a given input-output pattern pair.
Temporal learning is concerned with capturing in the weights the
relationship between neighbouring patterns in a sequence of
patterns.
Supervised Learning Algorithm Examples

Delta Learning: It was introduced by Bernard Widrow and Marcian Hoff and is also known as the Least Mean Square (LMS) method. It reduces the error over the entire learning and training process. In order to minimize the error, it follows the gradient descent method, which requires the activation function to be continuous.

Outstar Learning: It was first proposed by Grossberg in 1976. It uses the concept that a neural network is arranged in layers, and the weights connected to a particular node should be equal to the desired outputs of the neurons that are connected through those weights.
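The delta (LMS) rule can be sketched for a single linear neuron, where each weight moves in proportion to the error and the corresponding input: w ← w + α (d − y) x. The training data, the learning rate of 0.1, the number of epochs, and the function name `delta_rule_train` are assumptions chosen for this illustration; the targets were generated from d = 2·x1 − 1·x2 so that the correct weights are known.

```python
def delta_rule_train(samples, learning_rate=0.1, epochs=200):
    """Train a linear neuron y = sum_i w_i * x_i with the delta (LMS) rule."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in samples:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = target - output  # desired output minus actual output
            # Gradient-descent step on the squared error for this sample.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
    return weights


# Hypothetical noise-free samples generated from d = 2*x1 - 1*x2.
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
learned_weights = delta_rule_train(samples)
```

With these samples the learned weights approach (2, −1), the values that make the error zero on every sample.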
Basic Learning Laws: Unsupervised learning

Unsupervised learning discovers features in a given set of patterns and organizes the patterns accordingly.
There is no externally specified desired output as in the case of supervised learning.
Examples of unsupervised learning laws are: Hebbian learning, differential Hebbian learning, principal component learning, and competitive learning.
Unsupervised learning uses mostly local information to update the
weights.
The local information consists of signal or activation values of the
units at either end of the connection for which the weight update is
being made.
Unsupervised Learning Algorithm Examples

Hebbian Learning: It was proposed by Hebb in 1949 to improve the weights of nodes in a network. The change in weight is based on the input, the output, and the learning rate. The transpose of the output is needed for the weight adjustment.

Competitive Learning: It is a winner-takes-all strategy. When an input pattern is sent to the network, all the neurons in the layer compete with each other to represent the input pattern. The winner gets an output of 1 and all the others 0, and only the winning neuron has its weights adjusted.
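Both rules use only local information, which the following sketch tries to make concrete. The Hebbian step is written per weight as Δw_i = α · x_i · y. For the competitive step, the notes do not give the update formula, so the common form that moves the winner's weights toward the input, w ← w + α (x − w), is assumed here; the function names and numbers are also illustrative.

```python
def hebbian_update(weights, inputs, output, learning_rate):
    """Hebbian rule: strengthen each weight by learning_rate * input * output."""
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]


def competitive_update(weight_rows, inputs, learning_rate):
    """Winner-takes-all: only the neuron with the largest activation learns."""
    activations = [sum(w * x for w, x in zip(row, inputs)) for row in weight_rows]
    winner = max(range(len(weight_rows)), key=lambda j: activations[j])
    outputs = [1 if j == winner else 0 for j in range(len(weight_rows))]
    new_rows = [list(row) for row in weight_rows]
    # Assumed update form: move the winner's weight vector toward the input.
    new_rows[winner] = [w + learning_rate * (x - w)
                        for w, x in zip(weight_rows[winner], inputs)]
    return new_rows, outputs


hebb_weights = hebbian_update([0.0, 0.0], [1.0, -1.0], output=1.0, learning_rate=0.5)
rows, outputs = competitive_update([[1.0, 0.0], [0.0, 1.0]], [0.9, 0.1],
                                   learning_rate=0.5)
```

In the competitive call the first neuron wins because its weight vector is closest in direction to the input, so only its row changes.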
Unsupervised and Supervised Learning Laws
Neural Network Architectures

Artificial neural networks are massively parallel adaptive networks of simple nonlinear computing elements called neurons, which are intended to abstract and model some of the functionality of the human nervous system in an attempt to partially capture some of its computational strengths.
Eight components of neural networks:
Neurons. These can be of three types:
– Input: receive external stimuli
– Hidden: compute intermediate functions
– Output: generate outputs from the network
Activation state vector. This is a vector of the activation levels $x_i$ of the individual neurons in the neural network, $X = (x_1, \dots, x_n)^T \in \mathbb{R}^n$.
Signal function. A function that generates the output signal of the neuron based on its activation.

Pattern of connectivity. This essentially determines the inter-neuron connection architecture, or the graph of the network. Connections, which model the inter-neuron synaptic efficacies, can be:
– excitatory (+)
– inhibitory (-)
– absent (0)
Activity aggregation rule. A way of aggregating activity at a neuron, usually computed as an inner product of the input vector and the neuron fan-in weight vector.
Activation rule. A function that determines the new activation level of a neuron on the basis of its current activation and its external inputs.
Learning rule. Provides a means of modifying connection strengths based both on external stimuli and network performance, with the aim of improving the latter.
Environment. The environments within which neural networks operate can be deterministic (noiseless) or stochastic (noisy).
Neural Network Architectures

Three different classes of network architectures:
– single-layer feed-forward
– multi-layer feed-forward
– recurrent
The architecture of a neural network is linked with the learning algorithm that is used to train it.
Single Layer Feed-forward
Neural Network Architectures: Perceptron Neuron Model
The perceptron is a special form of single-layer feed-forward architecture.
The perceptron, first proposed by Rosenblatt (1958), is a simple neuron that is used to classify its input into one of two categories.
A perceptron uses a step function that returns +1 if the weighted sum of its inputs is ≥ 0 and -1 otherwise.
Perceptron for Classification
The perceptron is used for binary classification.
First train a perceptron for a classification task.
– Find suitable weights in such a way that the training examples are correctly classified.
– Geometrically, try to find a hyperplane that separates the examples of the two classes.
A hyperplane is a decision boundary that divides the input space into two or more regions, each corresponding to a different class or output label. In a 2D space, a hyperplane is a straight line that divides the space into two halves.

The perceptron can only model linearly separable classes.
When the two classes are not linearly separable, it may be desirable to obtain a linear separator that minimizes the mean squared error.
Given training examples of classes C1 and C2, train the perceptron in such a way that:
– If the output of the perceptron is +1, then the input is assigned to class C1.
– If the output is -1, then the input is assigned to class C2.
Boolean function OR: linearly separable
Learning Process for Perceptron

Initially assign random weights between -0.5 and +0.5 to the inputs.
Training data is presented to the perceptron and its output is observed.
If the output is incorrect, the weights are adjusted using the following formula: wi ← wi + (a*xi*e), where ’e’ is the error produced and ’a’ (between 0 and 1) is the learning rate.
’e’ is defined as 0 if the output is correct; it is positive if the output is too low and negative if the output is too high.
Once the modification to the weights has taken place, the next piece of training data is used in the same way.
Once all the training data have been applied, the process starts again until all the weights are correct and all errors are zero.
Each iteration of this process is known as an epoch.
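The procedure above can be sketched as a short training loop. Several details are assumptions made for this sketch: the step function returns 1 for a strictly positive sum and 0 otherwise, there is no bias term, the target function is Boolean OR, and the starting weights (-0.2, 0.4) and learning rate 0.2 are simply values inside the allowed ranges.

```python
def step(x):
    """Assumed step function: 1 for a strictly positive sum, 0 otherwise."""
    return 1 if x > 0 else 0


def train_perceptron(samples, weights, learning_rate=0.2, max_epochs=100):
    """Repeat epochs of wi <- wi + a * xi * e until an epoch has no errors."""
    weights = list(weights)
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for inputs, target in samples:
            output = step(sum(w * x for w, x in zip(weights, inputs)))
            e = target - output  # 0 if correct, +1 if too low, -1 if too high
            if e != 0:
                errors += 1
                weights = [w + learning_rate * x * e
                           for w, x in zip(weights, inputs)]
        if errors == 0:
            return weights, epoch
    return weights, max_epochs


# Boolean OR as the training set: ((x1, x2), expected output).
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
final_weights, epochs_used = train_perceptron(or_samples, weights=[-0.2, 0.4])
predictions = [step(sum(w * x for w, x in zip(final_weights, inputs)))
               for inputs, _ in or_samples]
```

The loop stops at the first epoch in which every training example is classified correctly, which is the "all errors are zero" condition from the description.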
Learning Process for Perceptron Example 1
Initially consider w1 = -0.2 and w2 = 0.4. Here Step(x) returns 1 if x > 0 and 0 otherwise.
For training data x1 = 0 and x2 = 0, the expected output is 0.
Compute y = Step(w1*x1 + w2*x2) = Step(0) = 0. The output is correct, so the weights are not changed.
For training data x1 = 0 and x2 = 1, the expected output is 1.
Compute y = Step(w1*x1 + w2*x2) = Step(0.4) = 1. The output is correct, so the weights are not changed.
For the next training data x1 = 1 and x2 = 0, the expected output is 1.
Compute y = Step(w1*x1 + w2*x2) = Step(-0.2) = 0. The output is incorrect, hence the weights are to be changed.
Assume a = 0.2. The error is e = 1, so wi ← wi + (a*xi*e) gives w1 = -0.2 + (0.2*1*1) = 0 and w2 = 0.4 + (0.2*0*1) = 0.4.
With these weights, test the remaining training data.
Repeat the process till we get a stable result.
Learning Process for Perceptron Example 2

So far we have described the forward pass, meaning how the output is computed given an input and the weights. After the training is complete, we only run the forward pass to make predictions. But we first need to train our model to actually learn the weights, and the training procedure works as follows:
Randomly initialize the weights for all the nodes.
For every training example, perform a forward pass using the current weights, and calculate the output of each node going from left to right. The final output is the value of the last node.
Compare the final output with the actual target in the training data, and measure the error using a loss function.
Perform a backward pass from right to left and propagate the error to every individual node using backpropagation. Calculate each weight’s contribution to the error, and adjust the weights accordingly using gradient descent. Propagate the error gradients back starting from the last layer.
Perceptron Limitations
The perceptron can only model linearly separable functions.
– These are functions whose values can be plotted on a 2-dimensional graph and separated into two parts by a single straight line.
The Boolean functions given below are linearly separable:
– AND
– OR
– COMPLEMENT
It cannot model the XOR function, as it is not linearly separable.
– When the two classes are not linearly separable, it may be desirable to obtain a linear separator that minimizes the mean squared error.
Multi layer feed-forward NN (FFNN)

FFNN is a more general network architecture, where there are hidden layers
between input and output layers.
Hidden nodes do not directly receive inputs nor send outputs to the external
environment.
FFNNs overcome the limitation of single-layer NN.
They can handle non-linearly separable learning tasks.

The ANN for XOR has two hidden nodes that realize this non-linear separation and uses the sign (step) activation function.
Arrows from the input nodes to the two hidden nodes indicate the directions of the weight vectors (1,-1) and (-1,1).
The output node is used to combine the outputs of the two hidden nodes.
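A minimal sketch of such an XOR network is given below. The hidden weight vectors (1,-1) and (-1,1) come from the description above; everything else is an assumption made so that the example runs: a 0/1 step activation instead of the sign function, a threshold of 0.5 on every node, and an output node that adds the two hidden outputs with weights (1, 1).

```python
def step(v):
    """Assumed 0/1 step activation: fires only for a strictly positive input."""
    return 1 if v > 0 else 0


def xor_network(x1, x2):
    """Two hidden nodes with weight vectors (1,-1) and (-1,1), then a combiner."""
    h1 = step(1 * x1 + (-1) * x2 - 0.5)   # fires only for input (1, 0)
    h2 = step((-1) * x1 + 1 * x2 - 0.5)   # fires only for input (0, 1)
    return step(h1 + h2 - 0.5)            # output fires if either hidden node fired


truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
```

Each hidden node draws one straight line in the input plane, and the output node combines the two half-planes, which a single perceptron cannot do.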
FFNN Neuron Model
The classical learning algorithm of the FFNN is based on the gradient descent method.
For this reason the activation functions used in FFNNs are continuous functions of the weights, differentiable everywhere.
The activation function for node i may be defined as a simple form of the sigmoid function in the following manner:
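The formula itself is not reproduced in these notes. A commonly used simple form, assumed here rather than taken from the slide, is the logistic function φ(v) = 1 / (1 + e^(−v)), whose derivative can be written as φ(v)(1 − φ(v)). The sketch below checks that derivative numerically, since differentiability is exactly the property gradient descent relies on.

```python
import math


def sigmoid(v):
    """Logistic sigmoid (assumed form): smooth, with outputs between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-v))


def sigmoid_derivative(v):
    """Analytic derivative: phi'(v) = phi(v) * (1 - phi(v))."""
    s = sigmoid(v)
    return s * (1.0 - s)


def numerical_derivative(f, v, h=1e-6):
    """Central-difference estimate, used only to check the analytic form."""
    return (f(v + h) - f(v - h)) / (2.0 * h)


analytic = sigmoid_derivative(0.7)
numeric = numerical_derivative(sigmoid, 0.7)
```

The two values agree closely, which is what lets backpropagation compute gradients from the neuron outputs alone.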
Feed Backward Architecture: Backpropagation Algorithm

The backpropagation algorithm learns in the same way as a single perceptron.
It searches for weight values that minimize the total error of the network over the set of training examples (training set).
Backpropagation consists of the repeated application of the following two passes:
Forward pass: In this step, the network is activated on one example and the error of (each neuron of) the output layer is computed.
Backward pass: In this step, the network error is used for updating the weights. The error is propagated backwards from the output layer through the network, layer by layer. This is done by recursively computing the local gradient of each neuron.

Backpropagation adjusts the weights of the NN in order to minimize the total mean squared error of the network.
Consider a network of three layers.
Let us use $i$ to represent nodes in the input layer, $j$ to represent nodes in the hidden layer, and $k$ to represent nodes in the output layer.
$w_{ij}$ refers to the weight of the connection between node $i$ in the input layer and node $j$ in the hidden layer.

The following equation is used to derive the output value $Y_j$ of node $j$:
[Equation not recovered]

The error of output neuron $k$ after the activation of the network on the $n$-th training example $(x(n), d(n))$ is:
$$e_k(n) = d_k(n) - y_k(n)$$
The network error is the sum of the squared errors of the output neurons:
$$E(n) = \sum_{k} e_k^2(n) \qquad (1)$$
The total mean squared error is the average of the network errors of the training examples:
$$E_{av} = \frac{1}{N} \sum_{n=1}^{N} E(n) \qquad (2)$$
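Equations (1) and (2) translate directly into code. The sketch below uses made-up desired and actual outputs for a network with two output neurons and two training examples; the function names are chosen for the illustration.

```python
def network_error(desired, actual):
    """Equation (1): E(n) = sum over output neurons k of (d_k(n) - y_k(n))^2."""
    return sum((d - y) ** 2 for d, y in zip(desired, actual))


def mean_squared_error(desired_set, actual_set):
    """Equation (2): E_av = (1/N) * sum over the N training examples of E(n)."""
    errors = [network_error(d, y) for d, y in zip(desired_set, actual_set)]
    return sum(errors) / len(errors)


# Hypothetical values: two training examples, two output neurons each.
desired_set = [(1.0, 0.0), (0.0, 1.0)]
actual_set = [(0.5, 0.5), (0.0, 1.0)]

first_example_error = network_error(desired_set[0], actual_set[0])
average_error = mean_squared_error(desired_set, actual_set)
```

Here the first example contributes 0.25 + 0.25 and the second contributes nothing, so the average over the two examples is 0.25.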
Backpropagation Weight Update Rule

The backprop weight update rule is based on the gradient descent method:
– It takes a step in the direction yielding the maximum decrease of the network error E.
– This direction is the opposite of the gradient of E.
Iteration of the backprop algorithm is usually terminated when the sum of squares of the errors of the output values for all training data in an epoch is less than some threshold, such as 0.01.

Backprop is considered to have converged when the absolute rate of change in the average squared error per epoch is sufficiently small (in the range [0.01, 0.1]).
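The update rule and the stopping threshold can be illustrated on the smallest possible case: one sigmoid neuron and one training example. This is a sketch under stated assumptions, not the full multi-layer algorithm; the logistic activation, the inputs, the learning rate of 0.5, and the function names are all chosen for the example. The error is E = (d − y)², so the gradient is ∂E/∂w_i = −2 (d − y) y (1 − y) x_i, and each step moves the weights against it.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def train_single_neuron(inputs, target, weights, learning_rate=0.5,
                        error_threshold=0.01, max_iterations=1000):
    """Gradient descent on E = (d - y)^2 for one sigmoid neuron."""
    weights = list(weights)
    error_history = []
    for _ in range(max_iterations):
        y = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
        error = (target - y) ** 2
        error_history.append(error)
        if error < error_threshold:  # stopping rule: error below the threshold
            break
        # dE/dw_i = -2 * (d - y) * y * (1 - y) * x_i
        gradient = [-2.0 * (target - y) * y * (1.0 - y) * x for x in inputs]
        # Step in the direction opposite to the gradient.
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights, error_history


final_weights, error_history = train_single_neuron(
    inputs=[1.0, 0.5], target=1.0, weights=[0.0, 0.0])
```

The recorded errors shrink at every step until the last one falls below 0.01, at which point training stops.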
Difference Between Feed Forward and Feed Backward Networks

Feed forward network: Feed-forward neural networks enable signals to travel in one direction only, from input to output.
Feed backward network: Feedback networks can have signals traveling in both directions by introducing loops in the network.

Feed forward network: There is no feedback (loops), i.e., the output of any layer does not affect that same layer.
Feed backward network: There are feedback loops.

Feed forward network: Feed-forward networks tend to be simple networks that associate inputs with outputs. They are extensively used in pattern recognition.
Feed backward network: Feedback networks are very dynamic and can get extremely complex.

Feed forward network: Output efficiency is less.
Feed backward network: Output efficiency is more.
Module 2: The AI Problems

AI is developing at such an incredible speed that it sometimes seems magical.
There is an opinion among researchers and developers that AI could grow so immensely strong that it would be difficult for humans to control.
Humans developed AI systems by introducing into them every possible intelligence they could, and now the humans themselves seem threatened by them.

Threat to Privacy: An AI program that recognizes speech and understands natural language is theoretically capable of understanding every conversation in e-mails and on telephones.
Threat to Human Dignity: AI systems have replaced human beings in a few industries. They should not replace people in sectors where they hold dignified positions that pertain to ethics, such as nurse, surgeon, judge, police officer, etc.
Threat to Safety: Self-improving AI systems can become so much mightier than humans that it could be very difficult to stop them from achieving their goals, which may lead to unintended consequences.
Bias and Discrimination: AI systems can perpetuate and amplify human biases, leading to discriminatory outcomes.
Lack of Transparency: AI systems can be difficult to understand and interpret, making it challenging to identify and address bias and errors.
Regulation: There is a need for clear and effective regulation to ensure the responsible development and deployment of AI.
The underlying assumption

The heart of the underlying research lies in the physical symbol system hypothesis. A physical symbol system consists of:
Symbols: a set of entities that are physical patterns.
Symbol structures (expressions): a number of instances (tokens) of symbols related in some physical way.
Processes: these operate on expressions to produce other expressions.
The key concepts of a physical symbol system are:
– Designation: a symbol or expression can refer to something else.
– Interpretation: an expression can refer to one of the system’s own computational processes, which the system can evoke and execute.

Thus we can state the underlying assumption of artificial intelligence as follows:
Human thinking is a kind of symbol manipulation; symbol manipulation is necessary for intelligence.
Machines can be intelligent, because symbol manipulation is sufficient for intelligence.
Intelligence requires understanding:
– Does the program understand the symbols that it uses?
– Do the symbols have any meaning for the machine?
– Can a physical symbol system be intelligent?

The importance of the physical symbol system hypothesis:
It is a significant theory of the nature of human intelligence and so is of great interest to psychologists.
It also forms the basis of the belief that it is possible to build programs that can perform intelligent tasks as a human does.
Hypothesis: a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation.
– For example, the hypothesis that every event has a cause.
There appears to be no way to prove or disprove the hypothesis on logical grounds.
Thus it must be subjected to empirical validation.
We may find it to be false, or we may find that the bulk of the evidence says it is true. The only way to determine its truth is by experimentation.
AI Technique: Explanation with One Analogy of Algorithm

AI problems span a very broad spectrum.
AI problems appear to have very little in common, except that they are very hard.
One of the few hard and fast results to come out of the first three decades of AI research is that intelligence requires knowledge.
We first consider the Tic-Tac-Toe problem.
Tic-Tac-Toe Through Minimax Algorithm

The board used to play the Tic-Tac-Toe game consists of 9 cells laid out in the form of a 3x3 matrix. The game is played by 2 players and either of them can start. Each of the two players is assigned a unique symbol (generally O and X). The players take turns to make a move.
The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal line wins the game.
The aim is to build an unbeatable game of Tic-Tac-Toe.
The most popular method is the minimax algorithm.
Create an algorithm that calculates all possible moves for the computer.

Create a metric to determine the best possible move; the minimax algorithm works well for this.
If I play a perfect game, I will either win or draw the game. If we play against a perfect player, the best we can do is draw the game. The play strategy is:
– For example, let me be ’X’ and the opponent be ’O’.
– If I win, I get +10 points (we may consider +∞).
– If I lose, I lose 10 points, say -10 (we may consider -∞).
– If I draw, I get zero points, i.e. nobody gets any points.
So now we have a situation where we can determine a possible score for any game end state.
Let’s take an example from near the end of a game, where it is my turn. I am X. My goal here, obviously, is to maximize my end game score.
Tic-Tac-Toe MinMax Algo Contd...
If the top of this image represents the state of the game I see
when it is my turn, then I have some choices to make. There are
three places I can play, one of which clearly results in me winning
and earning the 10 points. If I don’t make that move, O could very
easily win. I don’t want O to win, so my goal here, as the first
player, should be to pick the maximum scoring move.
Tic-Tac-Toe MinMax Algo Contd...
What do we know about O? We should assume that O is also
playing to win this game, but relative to us, the first player, O
wants to choose the move that results in the worst
score for us: it wants to pick a move that would minimize our
ultimate score. Let’s look at things from O’s perspective, starting
with the two other game states from above in which we don’t
immediately win. The choice is clear: O would pick any of the
moves that result in a score of -10.
Tic-Tac-Toe MinMax Algo Description

The key to the Minimax algorithm is a back and forth between the two players,
where the player whose "turn it is" desires to pick the move with the maximum
score. In turn, the scores for each of the available moves are determined by the
opposing player deciding which of its available moves has the minimum score.
And the scores for the opposing player’s moves are again determined by the
turn-taking player trying to maximize its score, and so on all the way down the
move tree to an end state. A description of the algorithm, assuming X is the
"turn-taking player", would look something like:
– If the game is over, return the score from X’s perspective.
– Otherwise get a list of new game states for every possible move.
– Create a scores list.
– For each of these states, add the minimax result of that state to the scores list.
– If it’s X’s turn, return the maximum score from the scores list.
– If it’s O’s turn, return the minimum score from the scores list.
This algorithm is recursive: it flips back and forth between the players until a final
score is found.
Tic-Tac-Toe MinMax Algo Description
It’s X’s turn in state 1. X generates states 2, 3, and 4 and calls minimax on those states.
State 2 pushes the score of +10 to state 1’s score list, because the game is in an end state.
States 3 and 4 are not end states, so state 3 generates states 5 and 6 and calls minimax on them, while state 4 generates
states 7 and 8 and calls minimax on them.
State 5 pushes a score of -10 onto state 3’s score list, and likewise state 7 pushes a score of -10
onto state 4’s score list.
States 6 and 8 each generate the only available move, which leads to an end state, and so both of them add the score of +10 to the
score lists of states 3 and 4.
Because it is O’s turn in both states 3 and 4, O will seek the minimum score, and given the choice between -10 and
+10, both states 3 and 4 will yield -10.
Finally, the score list of state 1 is populated with +10, -10 and -10 from states 2, 3 and 4 respectively, and state 1, seeking to
maximize the score, will choose the winning move with score +10, state 2.
N Queens Problem using Backtracking without
Bounding Condition
The N queens problem asks us to place N queens on an N x N chessboard so that
no queen can attack any other queen directly.
A queen can attack another queen along a horizontal row, a vertical column or a
diagonal.
Problem Statement: We need to find all the possible arrangements in which
N queens can be placed, one in each row and each column, so that all queens are
safe. A queen moves in any of 8 directions and can directly attack another
queen in these 8 directions only.
N Queens Problem using Backtracking without
Bounding Condition
For simplicity, instead of N queens we consider the 4 queens problem
with a 4x4 chessboard.
4-Queens Problem: This problem asks us to put 4 queens on a
4 x 4 chessboard in such a way that one queen is present in each
row and column and no queen can attack any other queen directly.
This means no two queens can be placed in the same
diagonal, row or column. Ignoring these constraints, the total number of placements is 16C4 = 1820.
N Queens Problem using Backtracking without
Bounding Condition

Queens will be under attack if they are placed in
– the same row,
– the same column, or
– the same diagonal.
So we need to find out whether it is possible to arrange them on a
4x4 chessboard so that they are not under attack.
A solution exists, and there can be more than one
solution. Backtracking is used to find one.
So the total number of possibilities, starting from the root node, is 1 + 4 + 4x3 +
4x3x2 + 4x3x2x1 = 65. General formula: $1 + \sum_{i=0}^{3} \prod_{j=0}^{i} (4 - j) = 65$.
This is for a 4x4 chessboard. For 8x8, 4 should be replaced by 8:
$1 + \sum_{i=0}^{7} \prod_{j=0}^{i} (8 - j)$. In general, for NxN: $1 + \sum_{i=0}^{N-1} \prod_{j=0}^{i} (N - j)$.
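As a quick check of the counting formula, the sum of products can be evaluated directly. The function name `tree_nodes` is an assumption made for this sketch; it simply counts the root plus, for each level, the number of ways to place queens in distinct columns with no attack checks.

```python
from math import prod


def tree_nodes(n: int) -> int:
    """Nodes in the state space tree: 1 + sum_{i=0}^{n-1} prod_{j=0}^{i} (n - j)."""
    return 1 + sum(prod(n - j for j in range(i + 1)) for i in range(n))


print(tree_nodes(4))   # 65, matching 1 + 4 + 12 + 24 + 24
print(tree_nodes(8))   # 109601
```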
N Queens Problem using Backtracking without
Bounding Condition
The first queen, Q1, can be put anywhere on the chessboard, as
there is no other queen present on the board and hence no
restrictions. We therefore put Q1 at position (0,0). The path so
far is | (0,0) |.
Once Q1 has been placed, there are some squares where the next
queens cannot be placed under the given conditions. To put queen
Q2 in the second row we have two positions: (1,2) and (1,3). Let’s put
it at (1,2). The path so far is | (0,0) -> (1,2) |.
This placement of Q2 blocks every square of the third row (row index 2), and hence
there is no way to put Q3. If we put it at (2,0) or (2,2), Q1 will
attack it, and at (2,1) or (2,3) Q2 attacks it. Therefore we
backtrack from here and revise the partial solution by
readjusting the position of Q2. Instead of putting it at (1,2), we
put it at (1,3). The path so far is | (0,0) -> (1,3) |.
We put Q3 at (2,1). Hence, the path so far is | (0,0) -> (1,3) ->
(2,1) |.
N Queens Problem using Backtracking without
Bounding Condition

Now the same problem occurs again: there is no square left to place
Q4. There was only one way to place Q3 and all placements of Q2
have been explored, so now we return to Q1 for re-adjustment. We
move it from (0,0) to (0,1). The path so far is | (0,1) |.
N Queens Problem using Backtracking without
Bounding Condition
With Q1 at (0,1), the squares (1,0) and (1,2) lie on Q1’s diagonals and (1,1) is in its
column, so the only safe square for Q2 is (1,3). The path so far is | (0,1) -> (1,3) |.
Q3 is put at (2,0). The path so far is | (0,1) -> (1,3) -> (2,0) |. We put Q4 at (3,2).
The path so far is | (0,1) -> (1,3) -> (2,0) -> (3,2) |. Therefore, through backtracking, we reach a
solution where one queen is placed in each row and column so that no queen is
attacking any other on a 4 x 4 chessboard.
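The walk-through above can be reproduced with a short backtracking routine. This is a sketch under stated assumptions: rows and columns are 0-indexed as in the paths above, a partial solution is stored as a list `cols` where `cols[r]` is the column of the queen in row `r`, and the names `safe` and `solve_n_queens` are introduced here for illustration only.

```python
def safe(cols, row, col):
    """True if a queen at (row, col) is not attacked by the queens in `cols`."""
    for r, c in enumerate(cols):
        if c == col or abs(c - col) == abs(r - row):   # same column or diagonal
            return False
    return True


def solve_n_queens(n):
    """Return every solution as a list of column indices, one per row."""
    solutions = []
    cols = []                      # current path: cols[r] = column in row r

    def place(row):
        if row == n:               # all rows filled: record a solution
            solutions.append(list(cols))
            return
        for col in range(n):
            if safe(cols, row, col):
                cols.append(col)   # extend the path
                place(row + 1)
                cols.pop()         # backtrack and try the next column

    place(0)
    return solutions


print(solve_n_queens(4))   # [[1, 3, 0, 2], [2, 0, 3, 1]]
```

The first solution, [1, 3, 0, 2], is the path (0,1) -> (1,3) -> (2,0) -> (3,2) reached in the walk-through. One queen per row is enforced by construction, so only columns and diagonals need checking.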
4-queens Prob With Bounding Function State Space
Tree
Now we solve the problem with a bounding function (which is a condition): the
queens should not be in
– the same row,
– the same column, or
– the same diagonal.
We place queen q1 in the very first acceptable position, (1, 1). Next, we put
queen q2 so that the two queens do not attack each other. We find that if we
place q2 in column 1 or 2, a dead end is encountered. Thus the first
acceptable position for q2 is column 3, i.e. (2, 3), but then no position is left for
placing queen q3 safely. So we backtrack one step and place queen q2 at
(2, 4), the next possible position.
4-queens Prob With Bounding Function State Space
Tree
Then we obtain the position for placing q3, which is (3, 2). But this position
also leads to a dead end, as no place is found where q4 can be placed safely.
Then we have to backtrack to q1 and move it to (1, 2), after which all other queens
can be placed safely: q2 at (2, 4), q3 at (3, 1) and q4 at (4, 3). That is, we
get the solution (2, 4, 1, 3). This is one possible solution for the 4-queens
problem. To find other solutions, the whole method is repeated for all
partial solutions. The other solution for the 4-queens problem is (3, 1, 4, 2).
State Space Search
State space search is a process used in the field of computer
science, including artificial intelligence, in which successive
configurations or states of an instance are considered with the
intention of finding a goal state with the desired property.
There are two types of problem solving in state space search:
uninformed search and informed search.
Uninformed Search: Also called brute-force algorithms, these search through
all possible candidates for the solution in the search space, checking
whether each candidate satisfies the problem’s statement. These
are search problems in which no domain-specific
knowledge is available. For example: breadth-first search, tree search,
and so on.
Informed Search: These are search problems in which
domain-specific knowledge is available, in the form of a function
that helps us reach the goal. Informed search uses heuristic functions that are specific to
the problem and applies them to guide the search through the search
space, to try to reduce the amount of time spent in searching.
Heuristic Search Techniques: 8-Puzzle Problem
The purpose of a heuristic function is to guide the search along the most
promising (ideally optimal) path among the available paths. For complex
problems, the traditional algorithms are unable to find solutions within
practical time and space limits. Consequently, many special techniques have been
developed that use heuristic functions.
We can explain this with the 8-puzzle problem.
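The slides do not state which heuristic is used for the 8-puzzle, so the sketch below shows two commonly used ones as an assumption: the number of misplaced tiles and the Manhattan distance. The state encoding (a tuple of 9 numbers read row by row, with 0 for the blank), the goal layout `GOAL` and the function names are all chosen for this illustration.

```python
# Assumed goal layout; 0 stands for the blank square.
GOAL = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)


def misplaced_tiles(state, goal=GOAL):
    """Number of tiles (blank excluded) that are not on their goal square."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)


def manhattan(state, goal=GOAL):
    """Sum over tiles of the row distance plus column distance to the goal square."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        g = goal.index(tile)
        total += abs(idx // 3 - g // 3) + abs(idx % 3 - g % 3)
    return total


state = (1, 2, 3,
         4, 0, 6,
         7, 5, 8)
print(misplaced_tiles(state), manhattan(state))   # 2 2
```

A heuristic search would expand the successor state with the smallest heuristic value first, since it appears closest to the goal.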
Production Systems in AI

The production system, or production rule system, is a computer program typically
used to provide a solution to a particular problem in AI.
A production system is an AI program that consists of some rules and the
procedures or processes for following them. This set of rules can be termed
’productions’. They enhance action selection and automated planning.
Production Systems Components
It consists of the following components:
A set of rules of the form A → B. Each rule has a left-hand side (LHS) that determines
its applicability to the current problem state and a right-hand side (RHS) that describes the
operation to be performed, or the output state, if the rule is applied. These rules
operate on the global database.
A matching procedure that determines which rules can be applied to the current
state. A rule is applicable if its LHS matches the current problem state.
The global database, which contains all the information necessary to successfully
complete a task. It is further broken down into two parts: temporary and permanent.
The permanent part of the database consists of fixed actions, whereas the
temporary part alters according to circumstances.
The control strategy, which specifies the order in which rules are compared against
the global database to reach the correct result. It helps to decide which rule should
be applied and terminates the process when the system gives the correct output. It
also resolves the conflict that arises when multiple rules are applicable at the same time.
Architecture of Production Systems

There are mainly four characteristics of a production system in
AI: simplicity, modifiability, modularity, and
knowledge-intensiveness.
Characteristics of Production Systems
Simplicity: The production rule in AI is in the form of an ’IF-THEN’ statement.
Every rule in the production system has a unique structure. It helps represent
knowledge and reasoning in the simplest way possible to solve real-world
problems. Also, it helps improve the readability and understanding of the
production rules.
Modularity: Modularity helps in the incremental
improvement of the rule base, because the knowledge in production rules is held in discrete parts. The
production rules are made from a collection of information and facts that may not
have dependencies unless there is a rule connecting them together. The
addition or deletion of a single piece of information will not have a major effect on the
output. Modularity helps enhance the performance of the production system by
adjusting the parameters of the rules.
Modifiability: The feature of modifiability helps alter the rules as per
requirements. Initially, the skeletal form of the production system is created. We
then gather the requirements and make changes in the raw structure of the
production system. This helps in the iterative improvement of the production
system.
Knowledge-intensive: Production systems contain knowledge in the form of a
human language, e.g. English. The knowledge base is not built using any programming
language; the knowledge is represented in plain English sentences.
Production rules help draw productive conclusions from these sentences.
Advantages of Production Systems

Offers modularity, as all the rules can be added, deleted, or
modified individually.
Separate control system and knowledge base.
An excellent and feasible model that imitates human
problem-solving skills.
Beneficial in real-time applications and environment.
Offers language independence.
Disadvantages of Production Systems
Opacity: Communication between the rule interpreter and the
production rules creates difficulty for the understanding of the
control system and its strategies. There exist difficulties in
understanding the hierarchy of operations.
Inefficiency: There are various rules that we employ for solving a
problem. The rules can be effective in different ways. There are
conditions where multiple rules get activated during execution.
Searching exhaustively through all the individual rules in each cycle
reduces the efficiency of the production system.
Inability to Learn: A simple production system based on certain
rules is not capable of learning through experience, unlike
advanced AI systems. They are simply bound to specific rules for
actions. We can understand the rules and break them.
Conflict Resolution: To satisfy a condition, various production rules
are employed. A conflict arises when more than one rule is
triggered. In that case, the control system has to
determine the best possible rule from the set of conflicting rules.
Water Jug Problem

Every search process can be viewed as a traversal of a tree
structure in which each node represents a problem state and each
arc represents a relationship between the states represented by
the nodes.
The major issues in the design of a search program can be
explained with the water jug problem as an analogy.
A Water Jug Problem: You are given two jugs, a 4-gallon one and
a 3-gallon one, a pump which has unlimited water which you can
use to fill the jugs, and the ground on which water may be poured.
Neither jug has any measuring markings on it. How can you get
exactly 2 gallons of water in the 4-gallon jug?
Water Jug Problem

Here the initial state is (0, 0). The goal state is (2, n) for any value
of n.
State Space Representation: We represent a state of the
problem as a tuple (x, y), where x is the amount of water
in the 4-gallon jug and y is the amount of water in the
3-gallon jug. Note that 0 ≤ x ≤ 4 and 0 ≤ y ≤ 3.
To solve this we have to make some assumptions not mentioned
in the problem. They are:
– We can fill a jug from the pump.
– We can pour water out of a jug onto the ground.
– We can pour water from one jug to another.
– There is no measuring device available.
Operators: We must define a set of operators that will take us from
one state to another.
Production Rules for the Water Jug Problem in
Artificial Intelligence
To solve the water jug problem, many algorithms can be used.
These include:
Breadth-First Search: BFS visits the nodes
in order of their distance from the starting node. This implies that it
will visit the nearest nodes first.
Depth-First Search: DFS visits the nodes in
order of their depth. In the production rules for the water jug problem,
let x denote the amount of water in the 4-gallon jug and y the amount in the 3-gallon jug, i.e. x = 0, 1, 2, 3, 4
and y = 0, 1, 2, 3.
– Start state (0,0), goal state (2,n) for any n.
DFS sometimes turns out to
be an incomplete algorithm and may give a
non-optimal solution. It may fail to return a solution because
the algorithm goes on to the deepest
possible point in the first path. If that first path is
infinite, the algorithm falls into an infinite loop and never gives
any solution.
The Solution to the Water Jug Problem in Artificial
Intelligence
Current state (0,0). Loop until a goal state (2,n) is reached:
apply a rule whose left side matches the current state, and set the
new current state to the resulting state.
Solution 1:
– (0,0) Start state
– (0,3) Apply Rule 2: Fill the 3-gallon jug
– (3,0) Apply Rule 9: Pour all the water from the 3-gallon jug into the 4-gallon jug
– (3,3) Apply Rule 2: Fill the 3-gallon jug
– (4,2) Apply Rule 7: Pour water from the 3-gallon jug into the 4-gallon jug until it is full
– (0,2) Apply Rule 5: Empty the 4-gallon jug on the ground
– (2,0) Apply Rule 9: Pour all the water from the 3-gallon jug into the 4-gallon jug
Solution 2:
– (0,0) Start state
– (4,0) Apply Rule 1: Fill the 4-gallon jug
– (1,3) Apply Rule 8: Pour water from the 4-gallon jug into the 3-gallon jug until the 3-gallon jug is full
– (1,0) Apply Rule 6: Empty the 3-gallon jug on the ground
– (0,1) Apply Rule 10: Pour all the water from the 4-gallon jug into the 3-gallon jug
– (4,1) Apply Rule 1: Fill the 4-gallon jug
– (2,3) Apply Rule 8: Pour water from the 4-gallon jug into the 3-gallon jug until the 3-gallon jug is full
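A breadth-first search over the (x, y) state space finds such a sequence automatically. The sketch below is an illustration under assumptions: the six operators in `successors` are a simplified stand-in for the numbered production rules (they are not the rule table from the slides), and the names `successors` and `solve_water_jug` are hypothetical.

```python
from collections import deque

CAP_X, CAP_Y = 4, 3   # capacities of the 4-gallon and 3-gallon jugs


def successors(state):
    """All (description, next_state) pairs reachable in one operation."""
    x, y = state
    pour_xy = min(x, CAP_Y - y)   # amount that fits when pouring x into y
    pour_yx = min(y, CAP_X - x)   # amount that fits when pouring y into x
    return [
        ("fill the 4-gallon jug", (CAP_X, y)),
        ("fill the 3-gallon jug", (x, CAP_Y)),
        ("empty the 4-gallon jug", (0, y)),
        ("empty the 3-gallon jug", (x, 0)),
        ("pour 4-gallon into 3-gallon", (x - pour_xy, y + pour_xy)),
        ("pour 3-gallon into 4-gallon", (x + pour_yx, y - pour_yx)),
    ]


def solve_water_jug(start=(0, 0), target=2):
    """Breadth-first search for a state with `target` gallons in the 4-gallon jug."""
    parent = {start: None}        # also serves as the visited set
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state[0] == target:    # goal test: (2, n) for any n
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for _, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


print(solve_water_jug())
```

Because BFS explores states in order of distance from the start, the path it returns uses the minimum number of operations, six, the same length as both solutions listed above.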
Production Rules for Water Jug Problem
Search Graph Problem for Water Jug Problem

Graph search procedures are especially useful in dealing with
partially commutative production systems, in which a given set of
operations produces the same result regardless of the order in
which the operations are applied.
Hence we formally describe a problem by:
– Defining a state space that contains all possible
configurations of the relevant objects.
– Specifying one or more states
within that space that describe acceptable solutions.
Major Issue in the Design of Search Programs
Each search process can be considered to be a tree traversal. The object of the search is
to find a path from the initial state to a goal state using a tree. The number of nodes
generated might be huge; and in practice many of the nodes would not be needed. The
secret of a good search routine is to generate only those nodes that are likely to be useful,
rather than having a precise tree. The rules are used to represent the tree implicitly and
only to create nodes explicitly if they are actually to be of use.
The direction in which to conduct the search (forward versus backward reasoning). If the
search proceeds from the start state towards a goal state, it is a forward search; a
backward search proceeds from the goal state to the initial state.
How to select applicable rules (matching). Production systems typically spend most
of their time looking for rules to apply, so it is critical to have efficient procedures for
matching rules against states.
How to represent each node of the search process (the knowledge representation
problem, or the frame problem). In
games, an array suffices; in other problems, more complex data structures are
needed. Finally, in terms of data structures, considering the water jug as a typical
problem, do we use a graph or a tree? The breadth-first structure does take note of all
nodes generated but the depth-first one can be modified.
– Frames are AI data structures that divide knowledge into substructures by
representing stereotyped situations. A frame consists of a collection of slots and slot
values.
Heuristic search techniques: Search Algorithms in AI

In heuristic search techniques, the search proceeds using current information
about the problem to predict which path is closer to the goal and follows it, although this does
not always guarantee finding the best possible solution. Such techniques help in finding a
solution within reasonable time and space (memory).
Some prominent search algorithms are stated below:
Breadth-first search: A search strategy in which the highest layer of a decision
tree is searched completely before proceeding to the next layer is called breadth-first
search (BFS).
– In this strategy, no viable solutions are omitted and therefore it is guaranteed that
an optimal solution is found.
– This strategy is often not feasible when the search space is large.
– Advantages: i) Guaranteed to find an optimal solution (in terms of the smallest number
of steps to reach the goal).
– Advantages: ii) Can always find a goal node if one exists (complete).
– Disadvantages: High storage requirement: exponential in the tree depth.
Heuristic search techniques: Search Algorithms in AI
Cont...

Depth-first search: A search strategy that extends the current path as far as possible
before backtracking to the last choice point and trying the next alternative path is called
Depth-first search (DFS).
– This strategy does not guarantee that the optimal solution has been found.
– In this strategy, search reaches a satisfactory solution more rapidly than breadth first, an
advantage when the search space is large.
–Advantages: i) Low storage requirement: linear with tree depth.
–Advantages: ii)Easily programmed: function call stack does most of the work of
maintaining state of the search.
–Disadvantages: i) May find a sub-optimal solution (one that is deeper or more costly than
the best solution).
–Disadvantages: ii) Incomplete: without a depth bound, may not find a solution even if one
exists.

Bounded depth-first search: Depth-first search can spend much time (perhaps infinite
time) exploring a very deep path that does not contain a solution, when a shallow solution
exists. An easy way to solve this problem is to put a maximum depth bound on the search.
Beyond the depth bound, a failure is generated automatically without exploring any deeper.
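Bounded depth-first search can be sketched as below. The graph, its node names and the function name `depth_limited_dfs` are hypothetical, made up only to show how the depth bound turns a deep dead-end path into an automatic failure.

```python
def depth_limited_dfs(graph, node, goal, limit, path=None):
    """Depth-first search that gives up on a path once `limit` edges are used."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None                       # depth bound reached: report failure
    for child in graph.get(node, []):
        if child not in path:             # avoid cycling along the current path
            found = depth_limited_dfs(graph, child, goal, limit - 1, path)
            if found is not None:
                return found
    return None                           # all alternatives failed: backtrack


# Hypothetical graph: a long branch A-B-D-E-F and a shallow goal G under C.
graph = {"A": ["B", "C"], "B": ["D"], "D": ["E"], "E": ["F"], "C": ["G"]}
print(depth_limited_dfs(graph, "A", "G", limit=2))   # ['A', 'C', 'G']
print(depth_limited_dfs(graph, "A", "G", limit=1))   # None
```

With a bound of 2 the search abandons the branch through B after reaching D and then finds the shallow goal; with a bound of 1 the goal lies beyond the bound, so the search fails even though a solution exists.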
Heuristic search techniques: Search Algorithms in AI
Cont...
Generate-And-Test Algorithm: Generate-and-test is a very simple
algorithm that is guaranteed to find a solution, if one exists, provided the
search is done systematically.
Algorithm:
– Step 1: Generate a possible solution.
– Step 2: Test to see if this is the expected solution.
– Step 3: If the solution has been found, quit; otherwise go to step 1.
The potential solutions that need to be generated vary depending on the kind of problem. For
some problems the possible solutions may be particular points in the problem space, and
for other problems, paths from the start state.
Generate-and-test, like depth-first search, requires that complete solutions be generated
for testing. In its most systematic form, it is simply an exhaustive search of the problem
space. Solutions can also be generated randomly, but then a solution is not guaranteed.
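As an illustration of the generate-and-test loop, the sketch below applies it to the 4-queens problem discussed earlier. This pairing is a choice made for this example, not something the slides prescribe: candidates are complete placements (one column per row, generated systematically with `itertools.permutations`), and the test checks the diagonals. The names `is_solution` and `generate_and_test` are hypothetical.

```python
from itertools import permutations


def is_solution(cols):
    """Test step: no two queens share a diagonal (columns are already distinct)."""
    n = len(cols)
    return all(abs(cols[i] - cols[j]) != j - i
               for i in range(n) for j in range(i + 1, n))


def generate_and_test(n):
    """Generate complete candidates one by one and stop at the first that passes."""
    for candidate in permutations(range(n)):   # step 1: generate
        if is_solution(candidate):             # step 2: test
            return candidate                   # step 3: quit on success
    return None                                # candidates exhausted


print(generate_and_test(4))   # (1, 3, 0, 2)
print(generate_and_test(3))   # None: no arrangement exists for 3 queens
```

Unlike backtracking, every candidate here is a complete solution before it is tested, which is why the method amounts to exhaustive search in its systematic form.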
Heuristic search techniques: Search Algorithms in AI
Cont...

Systematic Generate-And-Test: Generating complete solutions exhaustively and generating
random solutions are the two extremes; another approach lies in between.
In this approach the search process proceeds systematically, but some paths that
are unlikely to lead to the solution are not considered. This evaluation is performed by a heuristic
function. It generates a complete solution and then tests it.
A depth-first search tree with backtracking can be used to implement a systematic
generate-and-test procedure. In this procedure, if some intermediate states are likely
to appear often in the tree, it is better to modify the procedure to traverse a graph
rather than a tree.
Algorithm:
– Generate a possible solution.
– Test to see if this is the expected solution.
– If the solution has been found, stop.
– If the solution is incorrect, repeat from step 1.
Heuristic search techniques: Search Algorithms in AI
Cont...
Best First Search (Informed Search): Best First Search falls under the category of
Heuristic Search or Informed Search.
In BFS
and DFS, when we are at a node, we can consider any of the adjacent nodes as the next node. So
both BFS and DFS blindly explore paths without considering any cost function.
The idea of Best First Search is to use an evaluation function to decide which adjacent node is
most promising and then explore it.
We use a priority queue, ordered by cost rather than first-in first-out, to store the nodes. So the implementation is a
variation of BFS: we just need to change the Queue to a PriorityQueue.
Algorithm: i) Create 2 empty lists: OPEN and CLOSED.
ii) Start from the initial node (say N) and put it in the ’ordered’ OPEN list.
iii) Repeat the next steps until the GOAL node is reached:
– If the OPEN list is empty, then exit the loop returning ’False’.
– Select the first/top node (say N) in the OPEN list and move it to the CLOSED list. Also,
capture the information of the parent node.
– If N is a GOAL node, then exit the loop returning
’True’. The solution can be found by backtracking the path.
– If N is not the GOAL node, expand node N to generate the ’immediate’ next nodes linked
to node N and add all those to the OPEN list.
– Reorder the nodes in the OPEN list in ascending order according to an evaluation
function f(n).
Best First Search Example
We start from source "S" and search for goal "I" using the given costs
and Best First Search.
Create an empty PriorityQueue pq and insert the start node "S" into pq.
pq initially contains S. We remove S from pq and add the unvisited
neighbors of S to pq.
pq now contains {A, C, B} (C is put before B because C has lesser
cost).
We remove A from pq and add the unvisited neighbors of A to pq.
pq now contains {C, B, E, D}.
We remove C from pq and add the unvisited neighbors of C to pq.
pq now contains {B, H, E, D}.
We remove B from pq and add the unvisited neighbors of B to pq.
pq now contains {H, E, D, F, G}.
We remove H from pq. Since our goal "I" is a neighbor of H, we
return.
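The OPEN/CLOSED procedure and the trace above can be sketched with Python's `heapq` module as the priority queue. The figure with the actual graph and costs is not reproduced in these notes, so the adjacency list and the evaluation values in `h` below are hypothetical: they were chosen only so that nodes leave the queue in the order S, A, C, B, H described in the trace.

```python
import heapq


def best_first_search(graph, h, start, goal):
    """Greedy best-first search: always expand the OPEN node with the smallest h."""
    open_list = [(h[start], start)]       # priority queue ordered by h
    parent = {start: None}                # remembers how each node was reached
    closed = set()
    while open_list:
        _, node = heapq.heappop(open_list)
        if node in closed:
            continue
        closed.add(node)
        if node == goal:                  # rebuild the path via parent links
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parent:         # only unvisited neighbours are added
                parent[nxt] = node
                heapq.heappush(open_list, (h[nxt], nxt))
    return None


# Hypothetical graph and evaluation values (assumptions for this sketch).
graph = {"S": ["A", "B", "C"], "A": ["D", "E"], "B": ["F", "G"],
         "C": ["H"], "H": ["I"]}
h = {"S": 0, "A": 2, "C": 5, "B": 6, "H": 7, "E": 8, "D": 9,
     "F": 12, "G": 14, "I": 0}
print(best_first_search(graph, h, "S", "I"))   # ['S', 'C', 'H', 'I']
```

This version tests for the goal when a node is removed from the queue, following the algorithm steps, whereas the trace stops as soon as the goal appears as a neighbour of H; both return the same path here.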
Problem Reduction

Problem reduction search is a way of planning how best to solve a
problem that can be recursively decomposed into subproblems in
multiple ways.
Problem Reduction: It is like the divide and conquer strategy: a solution
to a problem can be obtained by decomposing it into smaller
sub-problems.
Each of these sub-problems can then be solved separately, and the
sub-solutions can
then be recombined to get a solution to the whole problem.
AND-OR graphs or AND-OR trees are used for representing the
solution.
Problem Reduction
An AND arc may point to several successor nodes, all of which must be solved for the arc to lead to a
solution. Several arcs may also emerge from a single node, indicating several possible
solutions. Hence the graph is known as AND-OR instead of AND. The figure shows an AND-OR
graph.

If we are looking for a sequence of actions to achieve some goal, one way is to use state
space search, where each node in the state space is a state of the world and we search
for the sequence of actions that takes us from the start state to the goal state.
And-OR-Graph
An AND-OR graph is useful for representing the solution of problems that can be solved by
decomposing them into a set of smaller problems, all of which must be solved.
The algorithm uses a single structure G, which represents the part of the search graph generated so far.
Each node in G points down to its immediate successors and up to its immediate
predecessors, and also carries the value h’, the estimated cost of a path from itself to a set of
solution nodes.
Algorithm for AND-OR graph
– Step 1: Initialize the graph to the starting node.
– Step 2: Loop until the starting node is labelled as solved or until the cost leads to the optimal
path.
– Step 3: Traverse the graph and collect the set of nodes that are on the path and have not yet been expanded
or labelled as solved.
– Step 4: Pick one of these unexpanded nodes and expand it to compute f’.
– Step 5: Change the f’ estimate of the newly expanded node to reflect the new information provided by
its successors.
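The cost revision in Steps 4 and 5 rests on one rule: the cost of an AND arc is the sum over all its successors, and a node takes the cheapest of its arcs. The sketch below shows only that evaluation rule on a tiny tree; it is not the full graph search algorithm. The graph encoding, the h’ values and the assumption that every edge costs 1 are all hypothetical.

```python
def revised_cost(node, graph, h, edge_cost=1):
    """Cheapest cost of solving `node` in an AND-OR tree.

    graph[node] is a list of arcs; each arc is a list of successor nodes that
    must ALL be solved (AND). The node may choose ANY one arc (OR).
    Nodes without arcs are unexpanded and keep their estimate h[node].
    """
    arcs = graph.get(node)
    if not arcs:
        return h[node]
    return min(
        sum(edge_cost + revised_cost(child, graph, h, edge_cost) for child in arc)
        for arc in arcs
    )


# Hypothetical example: A can be solved via B alone, or via C AND D together.
graph = {"A": [["B"], ["C", "D"]]}
h = {"B": 5, "C": 2, "D": 3}
print(revised_cost("A", graph, h))   # 6: via B costs 1+5, via C and D costs (1+2)+(1+3)=7
```

If the estimate for B later rose to 8, the arc through C and D (cost 7) would become the preferred one, which is exactly the kind of change that Step 5 propagates upwards.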
Constraint Satisfaction Problem (CSP)

A CSP is a problem in which we must assign values to a set of variables, each from its own
domain, so that a set of constraints is satisfied.
Whenever we formulate a CSP, we need to identify the following: a set of variables X = {X1,
X2, ..., Xn}, a set of constraints C = {C1, C2, ..., Cm}, and a set of domains D, where each
variable Xi has a non-empty domain Di of allowable
values (each domain contains at least one value).
The problem formulation in CSP includes:
– Step 1: Initial State: We need to define the initial state, i.e. the state from which we start
solving the problem.
– Step 2: Successor Function: The successor function assigns a value to a variable in such a way
that it does not conflict with the assignments already made.
– Step 3: Goal Test: The goal test checks whether the CSP has been solved, i.e. whether the assignment is complete.
– Step 4: Path Cost: The cost of the completed path; a constant cost (e.g. 1) is assumed
for every step.
Constraint Satisfaction Problem (CSP)
Given a map, we need to color each region red, green or blue such that no two neighbouring regions have the
same color.
Variables: WA, NT, Q, NSW, V, SA, T
Domain: {red, green, blue}
Constraints: adjacent regions must have different colors
– Explicit Constraint Example: (WA, NT) ∈ { (red, green), (red, blue), (blue, green), (blue,
red), (green, red), (green, blue) }
– Implicit Constraint Example: WA ≠ NT
– Solution: an assignment satisfying all constraints
– Example: { WA=red, NT=green, Q=red, NSW=green, V=red, SA=blue, T=green }
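The map-coloring CSP can be solved with a simple backtracking search over assignments. The slides do not list which regions are adjacent (the map figure is missing), so the adjacency table `NEIGHBOURS` below is an assumption based on the usual map of Australia; the names `consistent` and `backtrack` are hypothetical.

```python
# Assumed adjacency of the Australian regions (not given in the slides).
NEIGHBOURS = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
    "T": [],                      # Tasmania has no neighbours
}
COLOURS = ["red", "green", "blue"]


def consistent(region, colour, assignment):
    """Constraint check: no already-coloured neighbour has the same colour."""
    return all(assignment.get(n) != colour for n in NEIGHBOURS[region])


def backtrack(assignment=None):
    """Assign colours region by region, undoing choices that lead to a dead end."""
    if assignment is None:
        assignment = {}           # initial state: nothing assigned
    if len(assignment) == len(NEIGHBOURS):
        return assignment         # goal test: every variable has a value
    region = next(r for r in NEIGHBOURS if r not in assignment)
    for colour in COLOURS:
        if consistent(region, colour, assignment):
            assignment[region] = colour
            result = backtrack(assignment)
            if result is not None:
                return result
            del assignment[region]
    return None


print(backtrack())
```

The example assignment given above satisfies every constraint in this table as well. The search may return a different but equally valid coloring, since a CSP can have many solutions.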
Means-Ends-Analysis

Means-Ends Analysis: Most search strategies reason either forward or backward;
however, often a mixture of the two directions is appropriate. Such a mixed strategy
makes it possible to solve the major parts of a problem first and then go back and solve the
smaller problems that arise while assembling the final solution. Such a technique is called
"Means-Ends Analysis".
Means-ends analysis is a special type of knowledge-rich search that allows both
backward and forward searching.
The means-ends analysis process centers around finding the difference between the current
state and the goal state. The problem space of means-ends analysis has an initial state and
one or more goal states, a set of operators, each with a set of preconditions for its application, and a
difference function that computes the difference between two states s(i) and s(j).
A problem is solved using means-ends analysis by:
– 1. Comparing the current state s1 with a goal state s2 and computing their difference D12 =
s2 - s1.
– 2. Selecting a recommended operator OP that reduces the difference D12, and trying to
satisfy its preconditions.
– 3. Applying the operator OP if possible. If not, the current state is saved, a subgoal is
created and means-ends analysis is applied recursively to the subgoal.
– 4. If the subgoal is solved, the saved state is restored and work resumes on the original problem.
Means-Ends-Analysis
We know the initial state and the goal state as given below. In this problem, we need to reach the goal state by finding
the differences between the initial state and the goal state and applying operators.
To solve the above problem, we will first find the differences between initial states and goal states, and for each
difference, we will generate a new state and will apply the operators. The operators we have for this problem are: Move,
Delete, Expand
Step1: Evaluating the initial state- In the first step, we will evaluate the initial state and will compare the initial and Goal
state to find the differences between both states.
Step2: Applying Delete operator- As we can check the first difference is that in goal state there is no dot symbol which is
present in the initial state, so, first we will apply the Delete operator to remove this dot.
Step3: Applying Move Operator- After applying the Delete operator, the new state occurs which we will again compare
with goal state. After comparing these states, there is another difference that is the square is outside the circle, so, we
will apply the Move Operator.
Step4: Applying Expand Operator- Now a new state is generated in the third step, and we will compare this state with the
goal state. After comparing the states there is still one difference which is the size of the square, so, we will apply Expand
operator, and finally, it will generate the goal state.
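The four steps above can be sketched in a few lines. The encoding of the figure as a dictionary of features is an assumption made for illustration (the notes only show the shapes), and this sketch omits preconditions and subgoals, since every operator here is directly applicable.

```python
# Hypothetical encoding of the figure: each state is a dict of features.
initial = {"dot": True, "square_inside_circle": False, "square_size": "small"}
goal = {"dot": False, "square_inside_circle": True, "square_size": "large"}


def copy_from_goal(feature):
    """Build an operator that sets one feature to its goal value."""
    return lambda state, goal_state: {**state, feature: goal_state[feature]}


# Each kind of difference is reduced by one operator (feature -> operator).
OPERATORS = {
    "dot": ("Delete", copy_from_goal("dot")),
    "square_inside_circle": ("Move", copy_from_goal("square_inside_circle")),
    "square_size": ("Expand", copy_from_goal("square_size")),
}


def differences(state, goal_state):
    """Return the features on which the two states disagree, in a fixed order."""
    return [f for f in OPERATORS if state[f] != goal_state[f]]


def means_ends_analysis(state, goal_state):
    """Repeatedly pick the first remaining difference and apply its operator."""
    plan = []
    while True:
        diffs = differences(state, goal_state)
        if not diffs:                       # no difference left: goal reached
            return plan, state
        name, apply_op = OPERATORS[diffs[0]]
        state = apply_op(state, goal_state)
        plan.append(name)


plan, final_state = means_ends_analysis(initial, goal)
print(plan)  # ['Delete', 'Move', 'Expand']
```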
Module -3: Knowledge Representation Issues
For the purpose of solving complex problems encountered in AI, we need both a
large amount of knowledge and some mechanism for manipulating that knowledge to
create solutions to new problems.
A variety of ways of representing knowledge (facts) have been exploited in AI programs. In
all varieties of knowledge representation, we deal with two kinds of entities.
(i) Facts: Truths in some relevant world. These are the things we want to represent.
(ii) Representations of facts in some chosen formalism. These are the things we will actually
be able to manipulate.
One way to think of structuring these entities is at two levels : (a) the knowledge level, at
which facts are described, and (b) the symbol level, at which representations of objects at
the knowledge level are defined in terms of symbols that can be manipulated by programs.
The facts and representations are linked by two-way mappings. These links are called
representation mappings. The forward representation mapping maps from facts to
representations. The backward representation mapping goes the other way, from
representations to facts.

One common representation is natural language (particularly English) sentences.


Regardless of the representation for facts we use in a program , we may also need to be
concerned with an English representation of those facts in order to facilitate getting
information into and out of the system. We need mapping functions from English
sentences to the representation we actually use and from it back to sentences.
Module -3: Properties and Applications of Knowledge
Representation

A good knowledge representation should exhibit the following properties-


Representational adequacy- A major property of a knowledge representation system
is that it is adequate and can make an AI system understand, i.e., represent all the
knowledge required by it to deal with a particular field or domain.
Inferential adequacy- Ability to manipulate representational structures such that new
knowledge can be derived/inferred from the old.
Inferential efficiency- Ability to incorporate additional information into an existing
knowledge base that can be used to focus the attention of inference mechanisms in
the most promising direction.
Acquisitional efficiency- Ability to easily acquire new information automatically,
helping the AI to add to its current knowledge and consequently become
increasingly smarter and more productive, rather than relying on human intervention.
Applications of Knowledge Representation-
– Learning: Acquiring knowledge, this is more than simply adding new facts to a
knowledge base. Duplication needs to be avoided.
– Reasoning: Inferring facts from existing data, i.e., by utilizing knowledge. E.g., answering “Can Rohan play
an instrument well?” requires knowledge.
Module -3: Representations and Mappings
In order to solve complex problems encountered in artificial intelligence, one needs both a
large amount of knowledge and some mechanism for manipulating that knowledge to
create solutions.
Knowledge and Representation are two distinct entities. They play central but
distinguishable roles in the intelligent system.
Knowledge is a description of the world. It determines a system’s capability by what it
knows.
Representation is the way knowledge is encoded. It defines a system’s
performance in doing something.
Different types of knowledge require different kinds of representation.
Module -3: Mapping between Facts and
Representations

The Knowledge Representation models/mechanisms are often based on:


(i)Logic
(ii)Rules
(iii)Frames
(iv)Semantic Net
Logical Representations
Logic- It is the most basic form of representing knowledge to machines, where a
well-defined syntax with proper rules is used. This syntax must have no ambiguity in its
meaning and must deal with propositions. Logical representation can be of two types-
–Propositional Logic: This type of logical representation is also known as propositional
calculus or statement logic. This works in a Boolean, i.e., True or False method.
–First-order Logic: This type of logical representation is also known as First Order
Predicate Logic (FOPL) or predicate calculus. It represents objects using
quantifiers and predicates and is an advanced version of propositional logic.

This form of representation is the basis of most of the programming languages we know of
where we use semantics to convey information, and this form is highly logical.
However, the downside of this method is that due to the strict nature of representation
Semantic Networks
In this form, a graphical representation conveys how the objects are connected, and it is
often used with a data network. Semantic networks consist of nodes/blocks (the objects)
and arcs/edges (the connections) that explain how the objects are connected.
This form of representation is also known as an alternative to the FOPL form of
representation. The relationships found in semantic networks can be of two types -
IS-A and instance (KIND-OF). This form of representation is more natural than logical.
It is simple to understand; however, it suffers from being computationally expensive and does
not have the equivalent of the quantifiers found in the logical representation.
Production Rules
It is a simple if-else rule-based system and, in a way, is a combination of propositional and
FOPL logics.
This system comprises a set of production rules, a rule applier, working memory, and a
recognize-act cycle.
For every input, conditions are checked against the set of production rules, and upon finding
a suitable rule, an action is committed.
This cycle of selecting a rule based on some conditions and consequently acting to solve
the problem is known as the recognize-act cycle, which takes place for every input.
This method has certain problems, such as the inability to gain experience, as it doesn’t
store past results, and it can also be inefficient as, during execution, many other rules
may be active.
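A minimal sketch of a recognize-act cycle is shown below. The two rules and the fact names are hypothetical and only illustrate the mechanism: each rule pairs a condition on working memory with a fact to add when it fires.

```python
# Hypothetical rules: (name, condition on working memory, fact to add).
RULES = [
    ("R1", lambda wm: "has_feathers" in wm, "is_bird"),
    ("R2", lambda wm: "is_bird" in wm and "cannot_fly" not in wm, "can_fly"),
]


def recognize_act(working_memory):
    """Run recognize-act cycles until no rule adds anything new."""
    wm = set(working_memory)
    fired = []
    changed = True
    while changed:
        changed = False
        for name, condition, conclusion in RULES:
            # Recognize: the condition holds and firing would change memory.
            if condition(wm) and conclusion not in wm:
                wm.add(conclusion)      # Act: commit the rule's action.
                fired.append(name)
                changed = True
                break                   # start the next cycle from the first rule
    return wm, fired


memory, fired_rules = recognize_act({"has_feathers"})
print(sorted(memory), fired_rules)
```

Note that nothing is remembered between calls to `recognize_act`, which mirrors the drawback mentioned above: the system does not store past results.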
Frame Representation

At a fundamental level, frames can be imagined as a table having column names and values
in rows, with information being passed in this structure.
However, the proper understanding is that a frame is a collection of attributes and the values linked to
them. This AI-specific data structure uses slots and fillers (i.e., slot values, which can be of any
data type and shape).
It has a similar concept to how information is stored in a typical DBMS. These slots and
fillers form a structure - a frame. The slots here have the name (attributes), and knowledge
related to it is stored in the fillers.
The biggest advantage of this form of representation is that due to its structure, similar
data can be combined in groups as frame representation can divide the knowledge in
structures and then further into sub-structures.
Also, being like any typical data structure, a frame can be understood, visualized, and manipulated
easily, and typical operations such as adding or removing slots can be done
effortlessly.
Implicit and Explicit Knowledge

Knowledge is categorized into two major types:


Tacit corresponds to “informal” or “implicit” knowledge
– Exists within a human being;
– It is embodied.
– Difficult to articulate formally.
– Difficult to communicate or share.
– Hard to steal or copy.
– Drawn from experience, action, subjective insight.
Explicit corresponds to the “formal” type of knowledge
– Exists outside a human being;
– It is embedded.
– Can be articulated formally.
– Can be shared, copied, processed and stored.
– Easy to steal or copy.
– Drawn from an artifact of some type, such as a principle, procedure, process, or concept.
Knowledge Representation Framework

A variety of ways of representing knowledge have been exploited in AI programs.


There are two different kinds of entities we are dealing with:
–Facts: Truths in some relevant world. Things we want to represent.
–Representations of facts in some chosen formalism. Things we will actually be able to
manipulate.
These entities are structured at two levels:
–The knowledge level, at which facts are described.
–The symbol level, at which representations of objects are defined in terms of symbols that
can be manipulated by programs.
Knowledge Representation Framework
The computer requires a well-defined problem description to process and provide a well-
defined acceptable solution.
To collect fragments of knowledge, we first need to formulate a description in our
spoken language and then represent it in a formal language so that the computer can
understand it.
The computer can then use an algorithm to compute an answer.
This overall process is illustrated as the Knowledge Representation Framework.
Knowledge Representation Framework Steps

The informal formulation of the problem takes place first.
It is then represented formally, and the computer produces an output.
This output can then be represented as an informally described solution that the user understands
or checks for consistency.
Problem solving requires
– Formal knowledge representation, and
– Conversion of informal knowledge to formal knowledge, that is, the
conversion of implicit knowledge to explicit knowledge.
Mapping between Facts and Representation

Knowledge is a collection of facts from some domain.
We need a representation of “facts” that can be manipulated by a program.
Normal English is insufficient: it is currently too hard for a computer program to draw
inferences in natural languages.
Thus some symbolic representation is necessary. A good knowledge representation
enables fast and accurate access to knowledge and understanding of the content.
Knowledge Representation

Knowledge Representation Properties


Representational Adequacy: The ability to represent all kinds of knowledge that are
needed in that domain.
Inferential Adequacy: The ability to manipulate the representational structures to
derive new structures corresponding to new knowledge inferred from old.
Inferential Efficiency: The ability to incorporate additional information into the knowledge
structure that can be used to focus the attention of the inference mechanisms in the most
promising direction.
Acquisitional Efficiency: The ability to acquire new knowledge using automatic
methods wherever possible rather than relying on human intervention.

Knowledge Representation Schemes


Relational Knowledge
Inheritable Knowledge
Inferential Knowledge

Procedural Knowledge
Knowledge Representation Schemes: Relational
Knowledge
The simplest way to represent declarative facts is as a set of relations of the same sort used
in database systems.
It provides a framework to compare two objects based on equivalent attributes. Any instance
in which two different objects are compared is a relational type of knowledge.
The table below shows a simple way to store facts.
– The facts about a set of objects are put systematically in columns.
– This representation provides little opportunity for inference.
Given the facts alone, it is not possible to answer a simple question such as: “Who is the
heaviest player?”
But if a procedure for finding the heaviest player is provided, then these facts will
enable that procedure to compute an answer.
We can also ask things like who “bats left” and “throws right”.
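The point about needing a procedure in addition to the stored facts can be sketched as follows. The player records are hypothetical placeholder data (the original table is not reproduced in these notes); only the column names follow the questions asked in the text.

```python
# Hypothetical player records; only the column names come from the text.
PLAYERS = [
    {"player": "Player A", "height": "6-0", "weight": 180, "bats": "right", "throws": "right"},
    {"player": "Player B", "height": "5-10", "weight": 170, "bats": "right", "throws": "right"},
    {"player": "Player C", "height": "6-2", "weight": 215, "bats": "left", "throws": "left"},
    {"player": "Player D", "height": "6-3", "weight": 205, "bats": "left", "throws": "right"},
]


def heaviest_player(rows):
    """The procedure the bare table lacks: scan the weight column."""
    return max(rows, key=lambda row: row["weight"])["player"]


def select(rows, **conditions):
    """Return the names of players whose attributes match all conditions."""
    return [r["player"] for r in rows
            if all(r[col] == val for col, val in conditions.items())]


print(heaviest_player(PLAYERS))                       # Player C
print(select(PLAYERS, bats="left", throws="right"))   # ['Player D']
```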
Knowledge Representation Schemes: Inheritable
Knowledge

Here the knowledge elements inherit attributes from their parents in a hierarchical structure.
The knowledge is embodied in the design hierarchies found in the functional, physical and
process domains.
Within the hierarchy, elements inherit attributes from their parents, but in many cases, not
all attributes of the parent elements are prescribed to the child elements.
Inheritance is a powerful form of inference, but it is not adequate on its own. The
basic KR (Knowledge Representation) needs to be augmented with an inference mechanism.
Property inheritance: The objects or elements of specific classes inherit attributes and
values from more general classes.
The classes are organized in a generalization hierarchy.
Knowledge Representation Schemes: Inheritable
Knowledge Cont..

Boxed nodes - objects and values of attributes of objects.
Arrows - point from an object to its value.
This structure is known as a slot and filler structure, semantic network or a collection of
frames.
The steps to retrieve a value for an attribute of an instance object:
1. Find the object in the knowledge base.
2. If there is a value for the attribute, report it.
3. Otherwise, look for a value of the instance attribute; if there is none, fail.
4. Go to that node and look for a value for the attribute; if one is found, report it.
5. Otherwise, search upward through the isa links until a value is found for the attribute.
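The retrieval steps can be sketched as a lookup that first follows the instance link and then climbs the isa links. The knowledge base `KB` and its contents are hypothetical and serve only to exercise each step.

```python
# Hypothetical knowledge base: each node maps attribute names to values.
KB = {
    "Person": {"legs": 2},
    "Athlete": {"isa": "Person", "trains": True},
    "Cricketer": {"isa": "Athlete", "team_size": 11},
    "Rohan": {"instance": "Cricketer", "age": 24},
}


def get_value(obj, attribute):
    """Retrieve an attribute value for an instance, inheriting when needed."""
    node = KB.get(obj)                       # step 1: find the object
    if node is None:
        return None
    if attribute in node:                    # step 2: value stored directly
        return node[attribute]
    current = node.get("instance")           # step 3: follow instance, else fail
    while current is not None:
        class_node = KB[current]
        if attribute in class_node:          # step 4: value found on the class
            return class_node[attribute]
        current = class_node.get("isa")      # step 5: climb the isa links
    return None


print(get_value("Rohan", "age"), get_value("Rohan", "team_size"), get_value("Rohan", "legs"))
```

Here `None` stands for failure; a fuller version would need to distinguish "no value found" from a stored value of `None`.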
Knowledge Representation Schemes: Inferential
Knowledge

This knowledge generates new information from the given information.


This new information does not require further data gathering from the source but does require
analysis of the given information to generate new knowledge.
Example: given a set of relations and values, one may infer other values or relations.
Predicate logic (a form of mathematical deduction) is used to infer from a set of attributes.
Inference through predicate logic uses a set of logical operations to relate
individual data.
Represent knowledge as formal logic: “All dogs have tails” becomes ∀ x: dog(x) → hastail(x)
Advantages:
–A set of strict rules.
–Can be used to derive more facts.
–Truths of new statements can be verified.
–Guaranteed correctness.
–Many inference procedures are available to implement standard rules of logic; these are popular in
AI systems, e.g., automated theorem proving.
Knowledge Representation Schemes: Procedural
Knowledge

A representation in which the control information needed to use the knowledge is embedded in the
knowledge itself. Examples are computer programs, directions, and recipes; these indicate
a specific use or implementation.
Knowledge is encoded in procedures: small programs that know how to do
specific things and how to proceed.
Advantages:
– Heuristic or domain-specific knowledge can be represented.
– Extended logical inferences, such as default reasoning, are facilitated.
– Side effects of actions may be modelled. Some rules may become false in time;
keeping track of this in large systems may be tricky.
Disadvantages:
– Completeness - not all cases may be represented.
– Consistency - not all deductions may be correct. E.g., if we know that Fred is a bird, we
might deduce that Fred can fly. Later we might discover that Fred is an emu.
– Modularity is sacrificed. Changes in the knowledge base might have far-reaching effects.
– Cumbersome control information.
Logic Representation: Representation of Simple Facts
in Logic
AI systems need to represent knowledge. Two types of logical knowledge representation:
–Propositional Logic
–Predicate Logic (First Order Predicate Logic)

Propositional Logic
Propositional logic is a fairly good way of representing knowledge because it is
simple to deal with and a decision procedure for it exists. Real-world facts are represented
as logical propositions and are written as well-formed formulas in propositional logic.
In order to draw conclusions, facts are represented in a more convenient way as-
1. Marcus is a man,
represented as- man(Marcus)
2. Plato is a man
represented as- man(Plato)
3. All men are mortal.
represented as- mortal(men)

But propositional logic fails to capture the relationship between an individual being a man
and that individual being a mortal.
Propositional Logic

Propositional logic is a formal language that uses symbols to represent propositions and
logical connectives to combine them. The symbols used in propositional logic include
letters such as p, q, and r, which represent propositions, and logical connectives such as ∧
(conjunction), ∨ (disjunction), and ¬ (negation), which are used to combine propositions.
A statement is a proposition if it is either true or false. Examples of propositions include
“2+2=4” and “The sky is blue.”
Logical connectives are used to combine propositions to form more complex statements.
Truth tables are used to represent the truth values of propositions and the logical
connectives that combine them.

In propositional logic, inference rules are used to derive new propositions from existing
ones. Propositional logic is a limited form of logic that only deals with propositions that are
either true or false.
Summarized table for Propositional Logic Connectives

Connective   Name                    Meaning            Example
¬            Negation                “Not”              ¬p (“not p”) means “it is not the case that p”
∧            Conjunction             “And”              p ∧ q (“p and q”) means “both p and q are true”
∨            Disjunction             “Or”               p ∨ q (“p or q”) means “either p or q is true (or both)”
⊕            Exclusive Disjunction   “Exclusive Or”     p ⊕ q (“p xor q”) means “either p or q is true, but not both”
→            Implication             “If...then”        p → q (“if p then q”) means “if p is true, then q must be true”
↔            Bi-implication          “If and only if”   p ↔ q (“p iff q”) means “p is true if and only if q is true”
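The meaning of each connective can be checked by generating its truth table. The sketch below encodes every connective from the table as a Python function on booleans; the dictionary name `CONNECTIVES` and the function `truth_table` are illustrative choices.

```python
from itertools import product

# Each binary connective as a Python function on booleans.
CONNECTIVES = {
    "p ∧ q": lambda p, q: p and q,
    "p ∨ q": lambda p, q: p or q,
    "p ⊕ q": lambda p, q: p != q,
    "p → q": lambda p, q: (not p) or q,
    "p ↔ q": lambda p, q: p == q,
}


def truth_table():
    """Return one row per assignment: (p, q, ¬p, then each binary connective)."""
    rows = []
    for p, q in product([True, False], repeat=2):
        rows.append((p, q, not p) + tuple(f(p, q) for f in CONNECTIVES.values()))
    return rows


print("p", "q", "¬p", *CONNECTIVES, sep="\t")
for row in truth_table():
    print(*("T" if value else "F" for value in row), sep="\t")
```

Note that p → q is false only in the row where p is true and q is false, which is why it is encoded as (not p) or q.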
Propositional Logic: Properties of Operators

In propositional logic, operators are used to combine propositions to form more complex propositions. The properties of
these operators are illustrated here with the corresponding set operations (union ∪ for ∨, intersection ∩ for ∧, complement ’
for ¬). Suppose A = {2, 4, 6}, B = {3, 6, 9}, C = {3, 6, 7}, and the universal set is U = {1, 2, ..., 9}.
Commutativity: The commutative property states that the order in which the operands are given does not affect the
result. For example, the union of A and B is A ∪ B = {2, 3, 4, 6, 9}, which is the same as the union of B and A, i.e., B ∪ A
= {2, 3, 4, 6, 9}. Therefore, the union operator is commutative.
Associativity: The associative property states that the grouping of operators does not affect the result. For example, the
intersection of A, B, and C is (A ∩ B) ∩ C = {6} ∩ {3, 6, 7} = {6} and A ∩ (B ∩ C) = {2, 4, 6} ∩ {3, 6} = {6}. Therefore,
the intersection operator is associative.
Identity element: The identity element property states that there is an element that can be combined with any other
element using the operator without changing that element. For example, the intersection of A and the universal set U is A ∩
U = {2, 4, 6} = A. Therefore, the universal set is the identity element of the intersection operator.
Distributive: The distributive property states that one operator can be distributed over the other operator. For example,
the intersection operator is distributive over the union operator. That is, A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). For instance, A
∩ (B ∪ C) = {2, 4, 6} ∩ {3, 6, 7, 9} = {6}, and (A ∩ B) ∪ (A ∩ C) = {6} ∪ {6} = {6}.
De Morgan’s Law: De Morgan’s Law states that the complement of the union of two sets A and B is equal to the
intersection of the complement of A and the complement of B. That is, (A ∪ B)’ = A’ ∩ B’. For example, (A ∪ B)’ = {2, 3,
4, 6, 9}’ = {1, 5, 7, 8} and A’ ∩ B’ = {1, 3, 5, 7, 8, 9} ∩ {1, 2, 4, 5, 7, 8} = {1, 5, 7, 8}. Therefore, De Morgan’s Law holds.
Double-negation elimination: Double negation elimination is a valid rule of replacement that states that if not not-A is true,
then A is true. For example, if A = {2, 4, 6}, then A’’ = {2, 4, 6}, which is equal to A. Therefore, double negation
elimination holds.
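The same properties can be checked directly on propositions by comparing both sides of each law over every truth assignment. This is a small sketch; the helper `equivalent` and the dictionary `LAWS` are names introduced here for illustration.

```python
from itertools import product


def equivalent(f, g, n):
    """True if f and g agree on every assignment of n boolean variables."""
    return all(f(*vals) == g(*vals) for vals in product([True, False], repeat=n))


# Each law as (left-hand side, right-hand side, number of variables).
LAWS = {
    "commutativity": (lambda p, q: p or q, lambda p, q: q or p, 2),
    "associativity": (lambda p, q, r: (p and q) and r,
                      lambda p, q, r: p and (q and r), 3),
    "identity": (lambda p: p and True, lambda p: p, 1),
    "distributivity": (lambda p, q, r: p and (q or r),
                       lambda p, q, r: (p and q) or (p and r), 3),
    "de_morgan": (lambda p, q: not (p or q),
                  lambda p, q: (not p) and (not q), 2),
    "double_negation": (lambda p: not (not p), lambda p: p, 1),
}

results = {name: equivalent(f, g, n) for name, (f, g, n) in LAWS.items()}
print(results)
```

Exhaustive checking works here because n propositional variables have only 2^n assignments; True plays the role that the universal set U plays for intersection.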
Limitations of Propositional Logic

Limited expressivity: Propositional logic is limited in its ability to represent complex
relationships between objects or concepts. It can only express simple propositional
statements with binary truth values (true/false). This makes it difficult to represent
concepts such as uncertainty, ambiguity, and vagueness.
Inability to handle quantifiers: Propositional logic is unable to handle quantifiers such as
“all” or “some.” For example, it cannot represent the statement "all humans are mortal" in a
concise manner. This makes it difficult to reason about sets of objects or concepts.
Limited support for negation: Propositional logic can negate only whole propositions (¬p). It
cannot negate part of a statement’s internal structure, which makes negative statements such as
“not all humans are mortal” difficult to represent and reason about.
Difficulty with recursive structures: Propositional logic struggles to represent recursive
structures, such as lists or trees, which are common in many AI applications. Recursive
structures require a more expressive language, such as first-order logic or higher-order
logic.

Inability to handle temporal relationships: Propositional logic is not well-suited for
representing temporal relationships between events or states. It cannot represent concepts
such as causality or temporal precedence, which are crucial in many AI applications.
First Order Predicate logic

First-order Predicate logic (FOPL) models the world in terms of -


–Objects, which are things with individual identities
–Properties of objects that distinguish them from other objects
–Relations that hold among sets of objects
– Functions, which are a subset of relations where there is only one “value” for any given
“input”
First-order Predicate logic (FOPL) provides–
– Constants: a, b, dog33. Name a specific object.
– Variables: X, Y. Refer to an object without naming it
– Functions: Mapping from objects to objects.
– Terms: Refer to objects
– Atomic Sentences: in(dad-of(X), food6) Can be true or false, Correspond to propositional
symbols P, Q.
–A well-formed formula (wff) is a sentence containing no “free” variables. That is, all
variables are “bound” by universal or existential quantifiers.
(∀ x)P(x, y) has x bound as a universally quantified variable, but y is free.
First Order Predicate logic

Basic Elements of First-order logic
Constant      1, 2, A, John, Mumbai, cat, ...
Variables     x, y, z, a, b, ...
Predicates    Brother, Father, ...
Function      sqrt, LeftLegOf, ...
Connectives   ∧, ∨, ¬, ⇒, ⇔
Equality      ==
Quantifier    ∀, ∃
First Order Predicate logic
Quantifiers
Universal quantification
– (∀ x)P(x) means that P holds for all values of x in the domain associated with that variable
– E.g., (∀ x) dolphin(x) → mammal(x)
Existential quantification
– (∃ x)P(x) means that P holds for some value of x in the domain associated with that variable
– E.g., (∃ x) mammal(x) ∧ lays-eggs(x)
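Over a finite domain, the two quantifiers correspond to Python's `all` and `any`. The sketch below evaluates the two example formulas on a small hypothetical domain; the objects and their properties are made up for illustration.

```python
# Hypothetical finite domain: each object lists the properties that hold for it.
DOMAIN = {
    "flipper": {"dolphin", "mammal"},
    "rex": {"dog", "mammal"},
    "perry": {"platypus", "mammal", "lays_eggs"},
    "tweety": {"bird", "lays_eggs"},
}


def holds(predicate, x):
    """True if the unary predicate holds for object x."""
    return predicate in DOMAIN[x]


# (∀ x) dolphin(x) → mammal(x): the implication must hold for every object.
all_dolphins_are_mammals = all(
    (not holds("dolphin", x)) or holds("mammal", x) for x in DOMAIN
)

# (∃ x) mammal(x) ∧ lays-eggs(x): at least one object satisfies both.
some_mammal_lays_eggs = any(
    holds("mammal", x) and holds("lays_eggs", x) for x in DOMAIN
)

print(all_dolphins_are_mammals, some_mammal_lays_eggs)  # True True
```

The universal formula uses an implication and the existential one a conjunction; swapping them is a common mistake, since (∀ x) dolphin(x) ∧ mammal(x) would wrongly claim that everything is a dolphin.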

Example that shows the use of predicate logic as a way of representing knowledge. The facts described by these
sentences can be represented as a set of well-formed formulas (wffs):
1. Marcus was a man.
man(Marcus)
2. Marcus was a Pompeian.
Pompeian(Marcus)
3. All Pompeians were Romans.
∀ x: Pompeian(x) → Roman(x)
4. Caesar was a ruler.
ruler(Caesar)
5. All Romans were either loyal to Caesar or hated him.
inclusive-or: ∀ x: Roman(x) → loyalto(x, Caesar) ∨ hate(x, Caesar)
exclusive-or: ∀ x: Roman(x) → (loyalto(x, Caesar) ∧ ¬ hate(x, Caesar)) ∨ (¬ loyalto(x, Caesar) ∧ hate(x, Caesar))
6. Everyone is loyal to someone.
∀ x: ∃ y: loyalto(x, y)
7. People only try to assassinate rulers they are not loyal to.
∀ x: ∀ y: person(x) ∧ ruler(y) ∧ tryassassinate(x, y) → ¬ loyalto(x, y)
8. Marcus tried to assassinate Caesar.
tryassassinate(Marcus, Caesar)


First Order Predicate logic Cont..
Now suppose we want to use these statements to answer the question: Was Marcus
loyal to Caesar?
To produce a formal proof, we reason backward from the desired goal: ¬ loyalto(Marcus,
Caesar)
In order to prove the goal, we need to use the rules of inference to transform it into another
goal (or possibly a set of goals) that can, in turn, be transformed, and so on, until there are no
unsatisfied goals remaining.
The problem is that, although we know that Marcus was a man, we do not have any way to
conclude from that that Marcus was a person.
We need to add the representation of another fact to our system, namely: ∀ x: man(x)
→ person(x)
Now we can satisfy the last goal and produce a proof that Marcus was not loyal to Caesar.
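The backward reasoning described above can be sketched with a tiny backward chainer. This is an illustrative simplification, not a full theorem prover: the negated literal ¬ loyalto is encoded as an ordinary predicate named `not_loyalto` (an assumption of this sketch), goals are ground, and every variable in a rule body must also appear in its head.

```python
FACTS = {
    ("man", "Marcus"),
    ("ruler", "Caesar"),
    ("tryassassinate", "Marcus", "Caesar"),
}

# Rules as (head, body); strings starting with "?" are variables.
RULES = [
    (("person", "?x"), [("man", "?x")]),                     # the added fact
    (("not_loyalto", "?x", "?y"),                            # sentence 7
     [("person", "?x"), ("ruler", "?y"), ("tryassassinate", "?x", "?y")]),
]


def substitute(term, bindings):
    """Replace variables in a term by their bound values."""
    return tuple(bindings.get(t, t) for t in term)


def match(pattern, ground):
    """Match a rule head against a ground goal; return bindings or None."""
    if len(pattern) != len(ground) or pattern[0] != ground[0]:
        return None
    bindings = {}
    for p, g in zip(pattern[1:], ground[1:]):
        if p.startswith("?"):
            if bindings.setdefault(p, g) != g:
                return None
        elif p != g:
            return None
    return bindings


def prove(goal, depth=0):
    """A goal holds if it is a fact or if some rule's body can be proved."""
    print("  " * depth + "goal:", goal)
    if goal in FACTS:
        return True
    for head, body in RULES:
        bindings = match(head, goal)
        if bindings is not None and all(
            prove(substitute(sub, bindings), depth + 1) for sub in body
        ):
            return True
    return False


print(prove(("not_loyalto", "Marcus", "Caesar")))
```

Removing the rule person(x) from man(x) makes the proof fail at the subgoal person(Marcus), which is exactly the gap described in the text.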
Issues which need to be Addressed: First Order
Predicate logic Cont..

From the above example, we see that three important issues must be addressed in the
process of converting English sentences into logical statements and then using those
statements to deduce new ones:
–Many English sentences are ambiguous (for example, 5, 6, and 7 above). Choosing
the correct interpretation may be difficult.
–There is often a choice of how to represent the knowledge. Simple
representations are desirable, but they may exclude certain kinds of reasoning.
–Even in very simple situations, a set of sentences is unlikely to contain all
the information necessary to reason about the topic at hand. In order to be able to
use a set of statements effectively, it is usually necessary to have access
to another set of statements that represent facts that people consider too obvious to
mention.
Difference between Propositional Logic and Predicate
Logic

Basis: Definition
–Propositional logic: Propositional logic consists of a declarative statement represented as a symbol and with a truth
value, i.e., true or false, and can never be both simultaneously. “All birds can fly” cannot be represented in
propositional logic, because it involves a variable (the birds) and a quantifier (all).
–Predicate logic: Predicate logic consists of a predicate that gives further information about a sentence’s subject. It
consists of constants, variables, functions and relationships. A predicate can be referred to as an attribute that
determines the properties of the subject in a sentence. “All birds can fly” can be represented in predicate logic as
“∀x (Bird(x) → CanFly(x))”.
Basis: Variables
–Propositional logic: Propositional logic does not consist of variables.
–Predicate logic: Variables are present.
Basis: Logical connectives
–Propositional logic: The logical connectives in propositional logic are AND, OR, NOT, IF-THEN, IF-AND-ONLY-IF.
–Predicate logic: The logical connectives in predicate logic are the same as in propositional logic, with quantifiers
added.
Basis: Scope analysis
–Propositional logic: Scope analysis is not performed in propositional logic. It cannot express generalization,
specialization or patterns, e.g., “a triangle has 3 sides”.
–Predicate logic: Quantifiers are used to perform scope analysis in predicate logic, such as universal quantifiers,
existential quantifiers, uniqueness quantifiers, etc. It can express generalization, specialization or patterns,
e.g., no-of-sides(triangle) = 3.
Basis: Representation
–Propositional logic: Propositional logic is a generalized representation. It can represent a fact as a single entity,
e.g., “Meera is short”.
–Predicate logic: Predicate logic is a specialized representation. It can represent individual properties, e.g.,
short(meera).
Basis: Truth value
–Propositional logic: In propositional logic, a proposition has a truth value, i.e., true or false.
–Predicate logic: In predicate logic, the truth value depends on the value of the variable.
Basis: Use case
–Propositional logic: Propositional logic is used for analysing simple logical connections.
–Predicate logic: Predicate logic is used for expressing complex connections and decisions for a given variable.
Representing Instance and ISA Relationships
The specific attributes instance and isa play an important role, particularly in a useful form of reasoning called property
inheritance.
The predicates instance and isa explicitly capture the relationships they are used to express, namely class membership and
class inclusion.
The following Figure shows the first five sentences of the last section represented in logic in three different ways.
The first part of the figure contains the representations we have already discussed. In these representations, class
membership is represented with unary predicates (such as Roman), each of which corresponds to a class.
Asserting that P(x) is true is equivalent to asserting that x is an instance (or element) of P. The second part of the figure
contains representations that use the instance predicate explicitly.
Representing Instance and ISA Relationships

The predicate instance is a binary one, whose first argument is an object and whose second argument is a class to which
the object belongs.
isa is used to show class inclusion, e.g., isa(mega_star, rich)
instance is used to show class membership, e.g., instance(prince, mega_star)
But the representations in the second part of the figure do not use an explicit isa predicate.
Instead, subclass relationships, such as that between Pompeians and Romans, are described as shown in sentence 3.
The implication rule states that if an object is an instance of the subclass Pompeian then it is an instance of the
superclass Roman.
Note that this rule is equivalent to the standard set-theoretic definition of the subclass-superclass relationship.
The third part contains representations that use both the instance and isa predicates explicitly.
The use of the isa predicate simplifies the representation of sentence 3, but it requires that one additional axiom (shown
here as number 6) be provided.


Computable Functions and Predicates

Simple facts, such as the following greater-than and less-than relationships, can be expressed with predicates: gt(1,0) lt(0,1) gt(2,1) lt(1,2)
gt(3,2) lt(2,3)
It is often also useful to have computable functions as well as computable predicates. Thus we might want to be able to
evaluate the truth of gt(2 + 3,1)
To do so requires that we first compute the value of the plus function given the arguments 2 and 3, and then send the
arguments 5 and 1 to gt.
Consider the following set of facts, again involving Marcus:
1) Marcus was a man.
man(Marcus)
2) Marcus was a Pompeian.
Pompeian(Marcus)
3) Marcus was born in 40 A.D.
born(Marcus, 40)
4) All men are mortal.
∀ x: man(x) → mortal(x)
5) All Pompeians died when the volcano erupted in 79 A.D.
erupted(volcano, 79) ∧ ∀ x : [Pompeian(x) → died(x, 79)]
6) No mortal lives longer than 150 years.
∀ x: ∀ t1: ∀ t2: mortal(x) ∧ born(x, t1) ∧ gt(t2 - t1, 150) → died(x, t2)
7) It is now 1991.
now = 1991
The above example shows how these ideas of computable functions and predicates can be useful. It also makes use of
the notion of equality and allows equal objects to be substituted for each other whenever it appears helpful to do so
during a proof.
Now suppose we want to answer the question "Is Marcus alive?"
From the statements given here, there may be two ways of deducing an answer.
Either we can show that Marcus is dead because he was killed by the volcano or we can show that he must be dead
because he would otherwise be more than 150 years old, which we know is not possible.
As soon as we attempt to follow either of those paths rigorously, however, we discover, just as we did in the last
example, that we need some additional knowledge. For example, our statements talk about dying, but they say nothing
that relates to being alive, which is what the question is asking. So we add the following facts:
8) Alive means not dead.
∀x: ∀t: [alive(x, t) → ¬dead(x, t)] ∧ [¬dead(x, t) → alive(x, t)]

9) If someone dies, then he is dead at all later times.
∀x: ∀t1: ∀t2: died(x, t1) ∧ gt(t2, t1) → dead(x, t2)
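The two ways of deducing that Marcus is dead can be sketched in Python as follows. This is a hand-coded illustration and not a theorem prover: all names are hypothetical, and the alive function assumes that a person is alive whenever neither route proves him dead (a closed-world assumption that the logical statements themselves do not make).

```python
NOW = 1991            # fact 7
ERUPTION = 79         # year of the eruption in fact 5
men = {"Marcus"}      # fact 1
pompeians = {"Marcus"}  # fact 2
born = {"Marcus": 40}   # fact 3


def gt(a, b):
    """Computable predicate: greater-than."""
    return a > b


def mortal(x):
    """Fact 4: all men are mortal."""
    return x in men


def died(x):
    """Fact 5: every Pompeian died in 79; returns None if no death year is known."""
    return ERUPTION if x in pompeians else None


def dead_by_volcano(x, t):
    """Fact 9: whoever died at t1 is dead at every later time t."""
    t1 = died(x)
    return t1 is not None and gt(t, t1)


def dead_by_age(x, t):
    """Fact 6: no mortal lives longer than 150 years."""
    return mortal(x) and x in born and gt(t - born[x], 150)


def alive(x, t):
    """Fact 8, read under the closed-world assumption stated above."""
    return not (dead_by_volcano(x, t) or dead_by_age(x, t))


print(dead_by_volcano("Marcus", NOW), dead_by_age("Marcus", NOW), alive("Marcus", NOW))
```

Both routes succeed for the year 1991, so either one is enough to conclude that Marcus is not alive now.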
Resolution: Resolution in Predicate Logic

The resolution algorithm for predicate logic is as follows, assuming a set of given statements F and a statement to be
proved P:

Algorithm: Resolution
1. Convert all the statements of F to clause form.
2. Negate P and convert the result to clause form. Add it to the set of clauses obtained in 1.
–In logic, a clause is a disjunction of a finite collection of literals (atoms or their negations).
3. Repeat until a contradiction is found, no progress can be made, or a predetermined amount of effort has been expended.
–1. Select two clauses. Call these the parent clauses.
–2. Resolve them together. The resolvent will be the disjunction of all the literals of both parent clauses with
appropriate substitutions performed and with the following exception: If there is one pair of literals T1 and ¬T2
such that one of the parent clauses contains T1 and the other contains ¬T2 and if T1 and T2 are unifiable, then
neither T1 nor ¬T2 should appear in the resolvent. We call T1 and ¬T2 complementary literals. Use the substitution
produced by the unification to create the resolvent. If there is more than one pair of complementary literals, only
one pair should be omitted from the resolvent.
–3. If the resolvent is the empty clause, then a contradiction has been found. If it is not, then add it to the set
of clauses available to the procedure.
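The loop in step 3 can be sketched in Python for the simplified case of variable-free (ground) clauses, where no unification is needed. The encoding of a literal as a string with "~" marking negation, and the names negate, resolvents and refutes, are assumptions made for this sketch. Clause selection here is exhaustive, not guided by any strategy.

```python
from itertools import combinations


def negate(literal):
    """Return the complementary literal: p <-> ~p."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def resolvents(c1, c2):
    """All clauses obtained by cancelling exactly one complementary pair."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out


def refutes(clauses, max_rounds=100):
    """Return True if the empty clause can be derived from the clauses."""
    known = set(clauses)
    for _ in range(max_rounds):               # predetermined amount of effort
        new = set()
        for c1, c2 in combinations(known, 2):  # select two parent clauses
            for r in resolvents(c1, c2):
                if not r:
                    return True                # empty clause: contradiction
                new.add(r)
        if new <= known:
            return False                       # no progress can be made
        known |= new
    return False


# F: man(Marcus) and the ground instance ~man(Marcus) v mortal(Marcus).
# P: mortal(Marcus), added in negated form.
F = [frozenset({"man_Marcus"}), frozenset({"~man_Marcus", "mortal_Marcus"})]
negated_P = [frozenset({"~mortal_Marcus"})]
print(refutes(F + negated_P))
```

Handling clauses with variables would additionally require a unification step that produces the substitution mentioned in step 3.2; that part is left out of this sketch.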
Resolution Procedure

Resolution is a procedure, which gains its efficiency from the fact that it operates on statements that have been converted
to a very convenient standard form.
Resolution produces proofs by refutation (the action of proving a statement or theory to be wrong or false).
In other words, to prove a statement (i.e., to show that it is valid), resolution attempts to show that the negation of the
statement produces a contradiction with the known statements (i.e., that it is unsatisfiable).

The resolution procedure is a simple iterative process: at each step, two clauses, called the parent clauses, are
compared (resolved), resulting in a new clause that has been inferred from them. The new clause represents ways that the two
parent clauses interact with each other. Suppose that there are two clauses in the system:
winter ∨ summer; ¬winter ∨ cold
Now we observe that precisely one of winter and ¬winter will be true at any point.
If winter is true, then cold must be true to guarantee the truth of the second clause. If ¬ winter is true, then
summer must be true to guarantee the truth of the first clause.
Thus we see that from these two clauses we can deduce summer ∨ cold.
This is the deduction that the resolution procedure will make.
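The case analysis above can be checked mechanically with a truth table. The following Python sketch, with hypothetical function names, tries all eight truth assignments and confirms that summer ∨ cold holds whenever both parent clauses hold.

```python
from itertools import product


def parents_hold(winter, summer, cold):
    """Both parent clauses: (winter v summer) and (~winter v cold)."""
    return (winter or summer) and ((not winter) or cold)


def resolvent_holds(winter, summer, cold):
    """The resolvent: summer v cold."""
    return summer or cold


# Check every assignment in which both parent clauses are true.
sound = all(
    resolvent_holds(w, s, c)
    for w, s, c in product([False, True], repeat=3)
    if parents_hold(w, s, c)
)
print(sound)
```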
Resolution operates by taking two clauses that each contain the same literal, in this example, winter.
The literal must occur in the positive form in one clause and in negative form in the other. The
resolvent is obtained by combining all of the literals of the two parent clauses except the ones that cancel.
If the clause that is produced is the empty clause, then a contradiction has been found. For example, the two clauses
winter and ¬winter will produce the empty clause.