Unit V


ONTOLOGICAL ENGINEERING:

Ontologies are formal definitions of vocabularies that allow us to describe complex
structures and new relationships between vocabulary terms and between members of the classes
that we define. Ontologies generally describe specific domains such as scientific research areas.
Example:
Ontology depicting a movie:

Components:
1. Individuals –
Individuals, also known as instances of objects or concepts, may or may not be present
in an ontology. They represent the atomic level of an ontology.
For example, in the above movie ontology, individuals can be a film (Titanic), a director
(James Cameron), or an actor (Leonardo DiCaprio).
2. Classes –
Sets or collections of various objects are termed classes.
For example, in the above movie ontology, movie genres (e.g., Thriller, Drama) and
types of person (Actor or Director) are classes.
3. Attributes –
Properties that objects may possess.
For example, a movie is described by the set of ‘parts’ it contains like Script, Director, Actors.
4. Relations –
Ways in which concepts are related to one another.
For example, as shown in the diagram above, a movie has to have a script and actors in it.
3.1 ONTOLOGICAL ENGINEERING

Concepts such as Events, Time, Physical Objects, and Beliefs occur in many
different domains. Representing these abstract concepts is sometimes called ontological
engineering.

Figure 3.13 The upper ontology of the world, showing the topics to be
covered later in the chapter. Each link indicates that the lower concept is a
specialization of the upper one. Specializations are not necessarily disjoint;
a human is both an animal and an agent, for example.

The general framework of concepts is called an upper ontology because of the
convention of drawing graphs with the general concepts at the top and the more specific
concepts below them, as in Figure 3.13.

Categories and Objects

The organization of objects into categories is a vital part of knowledge representation.


Although interaction with the world takes place at the level of individual objects, much
reasoning takes place at the level of categories.

For example, a shopper would normally have the goal of buying a basketball, rather
than a particular basketball such as BB9. There are two choices for representing categories in
first-order logic: predicates and objects. That is, we can use the predicate Basketball(b), or we
can reify the category as an object, Basketballs.

We could then say Member(b, Basketballs), which we will abbreviate as b ∈ Basketballs, to say
that b is a member of the category of basketballs. We say Subset(Basketballs, Balls),
abbreviated as Basketballs ⊂ Balls, to say that Basketballs is a
subcategory of Balls. Categories serve to organize and simplify the knowledge base through
inheritance. If we say that all instances of the category Food are edible, and if we assert that
Fruit is a subclass of Food and Apples is a subclass of Fruit, then we can infer that every apple
is edible. We say that the individual apples inherit the property of edibility, in this case from
their membership in the Food category. First-order logic makes it easy to state facts about
categories, either by relating objects to categories or by quantifying over their members. Here
are some types of facts, with examples of each:

• An object is a member of a category.
  BB9 ∈ Basketballs
• A category is a subclass of another category.
  Basketballs ⊂ Balls
• All members of a category have some properties.
  (x ∈ Basketballs) ⇒ Spherical(x)
• Members of a category can be recognized by some properties.
  Orange(x) ∧ Round(x) ∧ Diameter(x) = 9.5 ∧ x ∈ Balls ⇒ x ∈ Basketballs
• A category as a whole has some properties.
  Dogs ∈ DomesticatedSpecies

Notice that because Dogs is a category and is a member of DomesticatedSpecies, the
latter must be a category of categories. Categories can also be defined by providing necessary
and sufficient conditions for membership. For example, a bachelor is an unmarried adult male:

x ∈ Bachelors ⇔ Unmarried(x) ∧ x ∈ Adults ∧ x ∈ Males
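
These facts translate almost directly into code. Below is a minimal, illustrative Python sketch (the dictionaries and the inherited_properties helper are our own invention, not part of the text) showing membership, subclassing, and property inheritance:

# Hypothetical sketch: categories as sets, subclass links, and property inheritance.
subclass_of = {"Apples": "Fruit", "Fruit": "Food", "Basketballs": "Balls"}
members = {"Basketballs": {"BB9"}, "Apples": {"Apple1"}}
properties = {"Food": {"edible"}, "Basketballs": {"spherical"}}

def categories_of(cat):
    """Yield cat and all its ancestors up the subclass chain."""
    while cat is not None:
        yield cat
        cat = subclass_of.get(cat)

def inherited_properties(obj):
    """Collect properties an object inherits via category membership."""
    props = set()
    for cat, insts in members.items():
        if obj in insts:
            for ancestor in categories_of(cat):
                props |= properties.get(ancestor, set())
    return props

print(sorted(inherited_properties("Apple1")))  # ['edible'] -- inherited from Food
print(sorted(inherited_properties("BB9")))     # ['spherical']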

Physical Composition

We use the general PartOf relation to say that one thing is part of another. Objects can
be grouped into part of hierarchies, reminiscent of the Subset hierarchy:

PartOf (Bucharest, Romania)


PartOf (Romania, EasternEurope)
PartOf(EasternEurope, Europe)
PartOf (Europe, Earth)
The PartOf relation is transitive and reflexive; that is,
PartOf(x, y) ∧ PartOf(y, z) ⇒ PartOf(x, z)

PartOf (x, x)
Therefore, we can conclude PartOf (Bucharest, Earth).
Sometimes we want to talk about a group of objects as a single composite object. For
example, if the apples are Apple1, Apple2, and Apple3, then

BunchOf({Apple1, Apple2, Apple3})

denotes the composite object with the three apples as parts (not elements). We can
define BunchOf in terms of the PartOf relation. Obviously, each element of s is part of
BunchOf(s):

∀x  x ∈ s ⇒ PartOf(x, BunchOf(s))

Furthermore, BunchOf(s) is the smallest object satisfying this condition. In other words,
BunchOf(s) must be part of any object that has all the elements of s as parts:

∀y  [∀x  x ∈ s ⇒ PartOf(x, y)] ⇒ PartOf(BunchOf(s), y)
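
As an illustration, the transitivity axiom can be run to a fixpoint over ground PartOf facts; this small Python sketch (the helper name is ours) derives PartOf(Bucharest, Earth):

# Illustrative sketch: computing transitive PartOf facts from base assertions.
part_of = {("Bucharest", "Romania"),
           ("Romania", "EasternEurope"),
           ("EasternEurope", "Europe"),
           ("Europe", "Earth")}

def transitive_closure(pairs):
    """Repeatedly apply PartOf(x,y) & PartOf(y,z) => PartOf(x,z) until fixpoint."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        new = {(x, z) for (x, y1) in closure for (y2, z) in closure if y1 == y2}
        if not new <= closure:
            closure |= new
            changed = True
    return closure

print(("Bucharest", "Earth") in transitive_closure(part_of))  # True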
Measurements

In both scientific and commonsense theories of the world, objects have height, mass,
cost, and so on. The values that we assign for these properties are called measures.
Length(L1)=Inches(1.5)=Centimeters(3.81)

Conversion between units is done by equating multiples of one unit to another:


Centimeters(2.54 ×d)=Inches(d)

Similar axioms can be written for pounds and kilograms, seconds and days, and dollars
and cents. Measures can be used to describe objects as follows:

Diameter (Basketball12)=Inches(9.5)
ListPrice(Basketball12)=$(19)
d∈ Days ⇒ Duration(d)=Hours(24)
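
A hedged sketch of such unit functions in Python: each unit maps a magnitude to a canonical measure (choosing meters as the canonical length unit is our assumption, not the text's):

# Units as functions mapping magnitudes into a canonical unit (meters, chosen here).
import math

def Inches(d):      return d * 0.0254
def Centimeters(d): return d * 0.01

assert math.isclose(Inches(1.5), Centimeters(3.81))        # Length(L1) both ways
assert math.isclose(Centimeters(2.54 * 9.5), Inches(9.5))  # Centimeters(2.54*d) = Inches(d)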

Time Intervals

Event calculus opens up the possibility of talking about time and time intervals.
We will consider two kinds of time intervals: moments and extended intervals. The distinction
is that only moments have zero duration:

Partition({Moments,ExtendedIntervals}, Intervals )
i∈Moments⇔Duration(i)=Seconds(0)

The functions Begin and End pick out the earliest and latest moments in an interval,
and the function Time delivers the point on the time scale for a moment.

The function Duration gives the difference between the end time and the start time.

Interval (i) ⇒Duration(i)=(Time(End(i)) − Time(Begin(i)))


Time(Begin(AD1900))=Seconds(0)
Time(Begin(AD2001))=Seconds(3187324800)
Time(End(AD2001))=Seconds(3218860800)
Duration(AD2001)=Seconds(31536000)

Two intervals Meet if the end time of the first equals the start time of the second. The
complete set of interval relations, as proposed by Allen (1983), is shown graphically in Figure
3.14 and logically below:

Meet(i,j) ⇔ End(i)=Begin(j)
Before(i,j) ⇔ End(i) < Begin(j)
After (j,i) ⇔ Before(i, j)
During(i,j) ⇔ Begin(j) < Begin(i) < End(i) < End(j)
Overlap(i,j) ⇔ Begin(i) < Begin(j) < End(i) < End(j)
Begins(i,j) ⇔ Begin(i) = Begin(j)
Finishes(i,j) ⇔ End(i) = End(j)

Equals(i,j) ⇔ Begin(i) = Begin(j) ∧ End(i) = End(j)

Figure 3.14 Predicates on time intervals
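
The predicates in Figure 3.14 translate directly into code; in this illustrative Python sketch, an interval is simply a (begin, end) pair:

# Allen's interval predicates over (begin, end) pairs.
def Begin(i): return i[0]
def End(i):   return i[1]

def Meet(i, j):     return End(i) == Begin(j)
def Before(i, j):   return End(i) < Begin(j)
def After(j, i):    return Before(i, j)
def During(i, j):   return Begin(j) < Begin(i) and End(i) < End(j)
def Overlap(i, j):  return Begin(i) < Begin(j) < End(i) < End(j)
def Begins(i, j):   return Begin(i) == Begin(j)
def Finishes(i, j): return End(i) == End(j)
def Equals(i, j):   return Begin(i) == Begin(j) and End(i) == End(j)

i, j = (1, 3), (3, 7)
print(Meet(i, j), Before(i, j), During((4, 5), j))  # True False True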

3.7 EVENTS

Event calculus reifies fluents and events. The fluent At(Shankar, Berkeley) is an
object that refers to the fact of Shankar being in Berkeley, but does not by itself say anything
about whether it is true. To assert that a fluent is actually true at some point in time we use the
predicate T, as in T(At(Shankar, Berkeley), t). Events are described as instances of event
categories. The event E1 of Shankar flying from San Francisco to Washington, D.C. is
described as

E1 ∈ Flyings ∧ Flyer(E1, Shankar) ∧ Origin(E1, SF) ∧ Destination(E1, DC)

Alternatively, we can define a three-argument version of the category of flying events and say
E1 ∈ Flyings(Shankar, SF, DC). We then use Happens(E1, i) to say that the event E1 took
place over the time interval i, and we say the same thing in functional form with Extent(E1) = i.
We represent time intervals by a (start, end) pair of times; that is, i = (t1, t2) is the time interval
that starts at t1 and ends at t2. The complete set of predicates for one version of the event
calculus is:

T(f, t)              Fluent f is true at time t
Happens(e, i)        Event e happens over the time interval i
Initiates(e, f, t)   Event e causes fluent f to start to hold at time t
Terminates(e, f, t)  Event e causes fluent f to cease to hold at time t
Clipped(f, i)        Fluent f ceases to be true at some point during time interval i
Restored(f, i)       Fluent f becomes true sometime during time interval i

We assume a distinguished event, Start, that describes the initial state by saying which fluents
are initiated or terminated at the start time. We define T by saying that a fluent holds at a point
in time if the fluent was initiated by an event at some time in the past and was not made false
(clipped) by an intervening event. A fluent does not hold if it was terminated by an event and
not made true (restored) by another event. Formally, the axioms are:

Happens(e, (t1, t2)) ∧ Initiates(e, f, t1) ∧ ¬Clipped(f, (t1, t)) ∧ t1 < t ⇒ T(f, t)
Happens(e, (t1, t2)) ∧ Terminates(e, f, t1) ∧ ¬Restored(f, (t1, t)) ∧ t1 < t ⇒ ¬T(f, t)

where Clipped and Restored are defined by

Clipped(f, (t1, t2)) ⇔ ∃e, t, t3  Happens(e, (t, t3)) ∧ t1 ≤ t < t2 ∧ Terminates(e, f, t)
Restored(f, (t1, t2)) ⇔ ∃e, t, t3  Happens(e, (t, t3)) ∧ t1 ≤ t < t2 ∧ Initiates(e, f, t)
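
A minimal sketch of these axioms over ground facts, assuming a toy scenario of our own (the event names and times are invented):

# Illustrative event-calculus interpreter; Happens is implicit in the records below.
initiates  = {("Start", "At(Shankar,SF)", 0), ("Fly1", "At(Shankar,DC)", 2)}
terminates = {("Fly1", "At(Shankar,SF)", 2)}

def clipped(f, t1, t2):
    """Clipped(f,(t1,t2)): some event terminates f at a time in [t1, t2)."""
    return any(f == g and t1 <= t < t2 for (_, g, t) in terminates)

def T(f, t):
    """T(f,t): f was initiated at some t1 < t and not clipped on (t1, t)."""
    return any(f == g and t1 < t and not clipped(f, t1, t)
               for (_, g, t1) in initiates)

print(T("At(Shankar,SF)", 1))  # True: initiated at 0, not yet terminated
print(T("At(Shankar,SF)", 3))  # False: clipped by Fly1 at time 2
print(T("At(Shankar,DC)", 5))  # True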

3.8 MENTAL EVENTS AND MENTAL OBJECTS

What we need is a model of the mental objects that are in someone’s head (or
something’s knowledge base) and of the mental processes that manipulate those mental objects.
The model does not have to be detailed. We do not have to be able to predict how many
milliseconds it will take for a particular agent to make a deduction. We will be happy just to be
able to conclude that mother knows whether or not she is sitting.

We begin with the propositional attitudes that an agent can have toward mental objects:
attitudes such as Believes, Knows, Wants, Intends, and Informs. The difficulty is that these
attitudes do not behave like “normal” predicates.

For example, suppose we try to assert that Lois knows that Superman can fly:

Knows(Lois, CanFly(Superman))

One minor issue with this is that we normally think of CanFly(Superman) as a sentence, but
here it appears as a term. That issue can be patched up just by reifying CanFly(Superman),
making it a fluent. A more serious problem is that, if it is true that Superman is Clark Kent,
then we must conclude that Lois knows that Clark can fly:

(Superman = Clark) ∧ Knows(Lois, CanFly(Superman)) |= Knows(Lois, CanFly(Clark))
Modal logic is designed to address this problem. Regular logic is concerned with a single
modality, the modality of truth, allowing us to express “P is true.” Modal logic includes special
modal operators that take sentences (rather than terms) as arguments.

For example, "A knows P" is represented with the notation K_A P, where K is the modal
operator for knowledge. It takes two arguments, an agent (written as the subscript) and a
sentence. The syntax of modal logic is the same as first-order logic, except that sentences can
also be formed with modal operators. In first-order logic a model contains a set of objects and
an interpretation that maps each name to the appropriate object, relation, or function. In modal
logic we want to be able to consider both the possibility that Superman’s secret identity is Clark
and that it isn’t. Therefore, we will need a more complicated model, one that consists of a
collection of possible worlds rather than just one true world. The worlds are connected in a
graph by accessibility relations, one relation for each modal operator. We say that world w1 is
accessible from world w0 with respect to the modal operator K_A if everything in w1 is
consistent with what A knows in w0, and we write this as Acc(K_A, w0, w1). In diagrams such
as Figure 3.15 we show accessibility as an arrow between possible worlds. In general, a
knowledge atom K_A P is true in world w if and only if P is true in every world accessible from
w. The truth of more complex sentences is derived by recursive application of this rule and the
normal rules of first-order logic. That means that modal logic can be used to reason about
nested knowledge sentences: what one agent knows about another agent’s knowledge. For
example, we can say that, even though Lois doesn't know whether Superman's secret identity
is Clark Kent, she does know that Clark knows:

K_Lois [K_Clark Identity(Superman, Clark) ∨ K_Clark ¬Identity(Superman, Clark)]

Figure 3.15 shows some possible worlds for this domain, with accessibility relations for Lois
and Superman.

Figure 3.15

In the TOP-LEFT diagram, it is common knowledge that Superman knows his own
identity, and neither he nor Lois has seen the weather report. So in w0 the worlds w0 and w2
are accessible to Superman; maybe rain is predicted, maybe not. For Lois all four worlds are
accessible from each other; she doesn’t know anything about the report or if Clark is Superman.
But she does know that Superman knows whether he is Clark, because in every world that is
accessible to Lois, either Superman knows I, or he knows ¬I. Lois does not know which is the
case, but either way she knows Superman knows. In the TOP-RIGHT diagram it is common
knowledge that Lois has seen the weather report. So in w4 she knows rain is predicted and in
w6 she knows rain is not predicted. Superman does not know the report, but he knows that Lois
knows, because in every world that is accessible to him, either she knows R or she knows ¬
R. In the BOTTOM diagram we represent the scenario where it is common knowledge that
Superman knows his identity, and Lois might or might not have seen the weather report. We
represent this by combining the two top scenarios, and adding arrows to show that Superman
does not know which scenario actually holds. Lois does know, so we don’t need to add any
arrows for her. In w0 Superman still knows I but not R, and now he does not know whether
Lois knows R. From what Superman knows, he might be in w0 or w2, in which case Lois does
not know whether R is true, or he could be in w4, in which case she knows R, or w6, in which
case she knows ¬R.
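
To make the possible-worlds semantics concrete, here is an illustrative Python sketch of the TOP-LEFT scenario (the world names and valuations approximate the figure; the K helper is ours):

# K_agent(formula) is true at w iff formula holds in every world accessible from w.
worlds = {"w0": {"I": True,  "R": True},   # I: Superman is Clark; R: rain predicted
          "w2": {"I": True,  "R": False},
          "w4": {"I": False, "R": True},
          "w6": {"I": False, "R": False}}

acc = {  # Acc(K_agent, w, w'): worlds the agent considers possible from w
    "Superman": {"w0": {"w0", "w2"}, "w2": {"w0", "w2"},
                 "w4": {"w4", "w6"}, "w6": {"w4", "w6"}},
    "Lois":     {w: set(worlds) for w in worlds},  # Lois rules nothing out
}

def K(agent, formula, w):
    return all(formula(w1) for w1 in acc[agent][w])

I = lambda w: worlds[w]["I"]
print(K("Superman", I, "w0"))   # True: Superman knows his identity in w0
print(K("Lois", I, "w0"))       # False: Lois does not know I
# But Lois knows that Superman knows whether I:
knows_whether_I = lambda w: K("Superman", I, w) or K("Superman", lambda v: not I(v), w)
print(K("Lois", knows_whether_I, "w0"))  # True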
CLASSICAL PLANNING
Classical planning is planning in which an agent takes advantage of the problem structure to
construct complex plans of action. The agent performs three tasks in classical planning:

 Planning: The agent devises a plan after understanding the problem.
 Acting: It decides what action it has to take.
 Learning: The actions taken by the agent make it learn new things.

A language known as PDDL (Planning Domain Definition Language) is used to represent
all actions in one action schema.
PDDL describes the four basic things needed in a search problem:

 Initial state: the representation of each state as a conjunction of ground, functionless
atoms.
 Actions: defined by a set of action schemas which implicitly define
the ACTIONS() and RESULT() functions.
 Result: obtained from the set of actions used by the agent.
 Goal: the same as a precondition: a conjunction of literals (whose value is either
positive or negative).
There are various examples which will make PDDL understandable:

 Air cargo transport


 The spare tire problem
 The blocks world and many more.

Let’s discuss one of them

 Air cargo transport

This problem can be illustrated with the help of the following actions:

 Load: This action is taken to load cargo.


 Unload: This action is taken to unload the cargo when it reaches its destination.
 Fly: This action is taken to fly from one place to another.

Therefore, the Air cargo transport problem is based on loading and unloading cargo and flying it
from one place to another.
Below is the PDDL description for Air cargo transport:
Init(At(C1, SFO) ∧ At(C2, JFK) ∧ At(P1, SFO) ∧ At(P2, JFK)
  ∧ Cargo(C1) ∧ Cargo(C2) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO))
Goal(At(C1, JFK) ∧ At(C2, SFO))
Action(Load(c, p, a),
  PRECOND: At(c, a) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a)
  EFFECT: ¬At(c, a) ∧ In(c, p))
Action(Unload(c, p, a),
  PRECOND: In(c, p) ∧ At(p, a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a)
  EFFECT: At(c, a) ∧ ¬In(c, p))
Action(Fly(p, from, to),
  PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
  EFFECT: ¬At(p, from) ∧ At(p, to))
The actions described above (i.e., Load, Unload, and Fly) affect the following two predicates:

 In(c, p): the cargo c is inside the plane p.
 At(x, a): the object x is at the airport a. Here, the object can be the cargo or the plane.

Note that when the plane flies from one place to another, it should carry all the cargo inside
it. It is difficult to give a solution for such a problem with PDDL, because PDDL does not
have a universal quantifier. Thus, the following approach is used:

 A piece of cargo ceases to be At anywhere when it is In a plane.
 The cargo only becomes At the new airport when it is unloaded.

Therefore, the plan for the solution is:


[Load(C1, P1, SFO), Fly(P1, SFO, JFK), Unload(C1, P1, JFK),
Load(C2, P2, JFK), Fly(P2, JFK, SFO), Unload(C2, P2, SFO)]
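
The plan can be checked mechanically by forward-applying STRIPS-style actions. The following Python sketch uses our own state encoding and drops the static Cargo/Plane/Airport predicates for brevity:

# Minimal STRIPS-style plan checker for the air-cargo problem (encoding is ours).
init = {"At(C1,SFO)", "At(C2,JFK)", "At(P1,SFO)", "At(P2,JFK)"}
goal = {"At(C1,JFK)", "At(C2,SFO)"}

def load(c, p, a):   return ({f"At({c},{a})", f"At({p},{a})"},     # preconditions
                             {f"In({c},{p})"}, {f"At({c},{a})"})   # add-list, delete-list
def unload(c, p, a): return ({f"In({c},{p})", f"At({p},{a})"},
                             {f"At({c},{a})"}, {f"In({c},{p})"})
def fly(p, f, t):    return ({f"At({p},{f})"},
                             {f"At({p},{t})"}, {f"At({p},{f})"})

plan = [load("C1","P1","SFO"), fly("P1","SFO","JFK"), unload("C1","P1","JFK"),
        load("C2","P2","JFK"), fly("P2","JFK","SFO"), unload("C2","P2","SFO")]

state = set(init)
for pre, add, delete in plan:
    assert pre <= state, "precondition failed"
    state = (state - delete) | add
print(goal <= state)  # True: the plan achieves the goal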

 The spare tire problem

The problem is that the agent needs to change a flat tire. The aim is to place a good spare tire on
the car's axle. There are four actions used to define the spare tire problem:

1. Remove the spare from the trunk.
2. Remove the flat tire from the axle.
3. Put the spare on the axle.
4. Leave the car unattended overnight (assuming that the car is parked in an unsafe
neighborhood).
The PDDL description for the spare tire problem is:
Init(Tire(Flat) ∧ Tire(Spare) ∧ At(Flat, Axle) ∧ At(Spare, Trunk))
Goal(At(Spare, Axle))
Action(Remove(obj, loc),
  PRECOND: At(obj, loc)
  EFFECT: ¬At(obj, loc) ∧ At(obj, Ground))
Action(PutOn(t, Axle),
  PRECOND: Tire(t) ∧ At(t, Ground) ∧ ¬At(Flat, Axle)
  EFFECT: ¬At(t, Ground) ∧ At(t, Axle))
Action(LeaveOvernight,
  PRECOND:
  EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk)
          ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle) ∧ ¬At(Flat, Trunk))
The solution to the problem is:
[Remove(Flat, Axle), Remove(Spare, Trunk), PutOn(Spare, Axle)].
Similarly, we can write PDDL descriptions for various problems.
Complexity of the classical planning
In classical planning, the following two decision problems occur:

1. PlanSAT: It is the question asking if there exists any plan that solves a planning problem.
2. Bounded PlanSAT: It is the question asking whether there is a solution of length k or less.

We found that:

 PlanSAT and Bounded PlanSAT are decidable for classical planning.


 Both decision problems lie in the complexity class PSPACE, which is larger than NP.

Note: PSPACE is the class of problems that can be solved by a deterministic Turing machine
using a polynomial amount of space.
From the above, it can be concluded that:

1. For many restricted forms of planning, PlanSAT is in P whereas Bounded PlanSAT is
NP-complete.

2. Optimal planning is hard relative to sub-optimal planning.

Advantages of Classical Planning

Classical planning has the following advantages:

 It has provided the facility to develop accurate domain-independent heuristics.
 The systems are easy to understand and work efficiently.

Heuristic Functions in Artificial Intelligence

As we have already seen, an informed search makes use of heuristic functions in order to reach
the goal node in a more directed way. There are typically several pathways in a search tree that
reach the goal node from the current node, so the selection of a good heuristic function certainly
matters. A good heuristic function is judged by its efficiency: the more information it encodes
about the problem, the more processing time it takes per node, but the fewer nodes the search
typically needs to expand.
Some toy problems, such as 8-puzzle, 8-queen, tic-tac-toe, etc., can be solved more efficiently
with the help of a heuristic function. Let’s see how:
Consider the following 8-puzzle problem, where we have a start state and a goal state. Our task is
to slide the tiles of the current/start state and place them in the order followed in the goal state.
There can be four moves: left, right, up, or down. There can be several ways to convert the
current/start state to the goal state, but we can use a heuristic function h(n) to solve the problem
more efficiently.
A heuristic function for the 8-puzzle problem is defined below:
h(n) = number of tiles out of position.
So, there is a total of three tiles out of position, i.e., 6, 5, and 4 (do not count the empty tile
present in the goal state), i.e., h(n) = 3. Now, we aim to reduce the value of h(n) to 0.
We can construct a state-space tree to minimize the h(n) value to 0, as shown below:

It is seen from the above state-space tree that the goal state is reached by reducing h(n) from 3
to 0. However, we can create and use several heuristic functions as per the requirement. It is also
clear from the above example that a heuristic function h(n) can be defined as the information
required to solve a given problem more efficiently. The information can be related to the nature
of the state, the cost of transforming from one state to another, goal node characteristics, etc.,
expressed as a heuristic function.
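
The misplaced-tiles heuristic is a one-liner in code. In this sketch the start state is our own example (the figure's state is not reproduced here), chosen so that tiles 4, 5, and 6 are out of position:

# Misplaced-tiles heuristic h(n) for the 8-puzzle; 0 denotes the blank (not counted).
def h(state, goal):
    """Number of tiles out of position, ignoring the blank."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
start = (1, 2, 3, 5, 6, 4, 7, 8, 0)   # tiles 4, 5, 6 out of position
print(h(start, goal))  # 3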
Properties of a Heuristic Search Algorithm

The use of a heuristic function in a heuristic search algorithm leads to the following properties of
the algorithm:

 Admissible Condition: An algorithm is said to be admissible if it returns an optimal
solution.
 Completeness: An algorithm is said to be complete, if it terminates with a solution (if the
solution exists).
 Dominance Property: If there are two admissible heuristic
algorithms A1 and A2 with heuristic functions h1 and h2, then A1 is said to
dominate A2 if h1 is better than h2 for all values of node n.
 Optimality Property: If an algorithm is complete, admissible, and dominating other
algorithms, it will be the best one and will definitely give an optimal solution.
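
Dominance can be spot-checked empirically. The sketch below compares the misplaced-tiles heuristic against the Manhattan-distance heuristic (a standard, more informed 8-puzzle heuristic not described above) on random states:

# Testing dominance of Manhattan distance (h2) over misplaced tiles (h1).
import random

goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def h1(s):  # misplaced tiles
    return sum(1 for a, b in zip(s, goal) if a != 0 and a != b)

def h2(s):  # sum of Manhattan distances of each tile from its goal square
    total = 0
    for idx, tile in enumerate(s):
        if tile == 0:
            continue
        gi = goal.index(tile)
        total += abs(idx // 3 - gi // 3) + abs(idx % 3 - gi % 3)
    return total

states = [tuple(random.sample(range(9), 9)) for _ in range(1000)]
print(all(h2(s) >= h1(s) for s in states))  # True: h2 dominates h1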

MODEL BASED HEURISTICS:

The heuristic method refers to finding the best possible solution to a problem quickly, effectively, and
efficiently. The word heuristic is derived from the ancient Greek word 'eurisko', meaning to find,
discover, or search. It is a practical mental shortcut for problem-solving and decision-making that
reduces cognitive load and does not need to be perfect. The method is helpful in getting a satisfactory
solution to a much larger problem within a limited time frame.

The trial-and-error heuristic is the most fundamental heuristic. It can be applied in all situations, from
matching nuts and bolts to finding the answer to an algebra problem. Some common heuristics used to
solve mathematical problems are visual representation, forward/backward reasoning, additional
assumptions, and simplification.

Four Principles of the Heuristic Method


György (George) Pólya gave the four principles of the heuristic method in his book How to Solve It,
published in 1945. These principles should be followed in the sequence in which they are given;
otherwise, it can be difficult to find the solution to the problem. That is why they are also called the
four steps of the method.

o First Principle - Understanding the Problem: This is the first step in solving a problem, and the most
important principle, because before solving a problem it is necessary to understand what the real problem
is. Many people skip this step of finding a suitable initial approach. The principle is focused on knowing the
problem and looking at it from other angles.
The various aspects covered under this principle are: what is the problem, what is going on, is there another
way to explain the problem, is all the required information available, etc. All these points help in
understanding the actual problem and its aspects.
o Second Principle - Making a Plan: A problem can be solved in many different ways. The second principle
says that one must find the best way to reach the solution of the given problem. For this purpose, the right
strategy is to first find the requirement. The 'working backward' reversal can help with this: one assumes
the solution is already in hand and works back from it toward the starting point.

It also helps in making an overview of the possibilities, immediately removing the less efficient ones,
comparing all the remaining possibilities, or applying symmetry. This improves a person's judgment as well
as creativity.
o Third Principle - Implementing the Plan: After forming a proper strategy, the plan can be implemented.
However, it is necessary to be patient and give the required time to solving the problem, because
implementing a plan is tougher than making one. If the plan does not provide a solution or does not live up
to expectations, it is advisable to repeat the second principle in a better way.
o Fourth Principle - Evaluation and Adaptation: This principle evaluates whether things went according to
plan; in other words, the planned way is matched against the standard way. Some plans may work while
others may not, so after a proper evaluation, the most appropriate way can be adopted to solve the main
problem.

Types of Heuristic Methods


Several heuristic methods were also used by Pólya. Some of the most popular methods are discussed below:

o Dividing Technique: Under this technique, the original problem is divided into smaller pieces or sub-problems
so that the answer can be found more easily. After solving these sub-problems separately, they can be merged
to get the final answer to the original problem.
o Inductive Method: This method starts from a smaller problem than the original one, which has already been
solved. The original, bigger problem can then be solved by deriving a generalization from the smaller problem
or by reusing the method applied to it.
o Reduction Method: Since a problem is shaped by many factors and causes, this method sets various limits on
the main problem in advance. It is helpful in reducing the leeway of the original problem and reaching the
solution more easily.
o Constructive Method: Under this method, the problem is solved step by step; each completed step is taken as
a partial victory, and consecutive steps are taken to reach the final stage. It helps in finding the best way to
solve the problem and obtaining a successful result.
o Local Search Method: In this method, the most feasible way of solving a problem is searched for and used.
Continuous improvement is made during the solving process, and when there is no more scope for
improvement, the method ends, and the final result is the answer to the problem (a minimal code sketch of
this idea follows this list).
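
As promised above, a minimal hill-climbing sketch of the local search method; the objective function and step-based neighbourhood are toy choices of ours:

# Hill climbing: keep moving to a better neighbour until no improvement remains.
def hill_climb(x, f, neighbours):
    while True:
        best = max(neighbours(x), key=f, default=x)
        if f(best) <= f(x):
            return x            # no neighbour improves: local optimum reached
        x = best

f = lambda x: -(x - 3) ** 2     # maximise: peak at x = 3
step = lambda x: (x - 0.1, x + 0.1)
print(round(hill_climb(0.0, f, step), 1))  # 3.0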

Uses of Heuristic in Various Fields

Psychology:

o Informal Modes of Heuristic:


o Affect Heuristic: Emotion is used as a mental shortcut to affect a decision. Emotion is the driving force
behind making a decision or solving an issue fast and effectively. It's used to assess the dangers and
advantages of something based on the pleasant or negative emotions people connect with a stimulus.
It can also be termed a gut decision because if the gut feeling is correct, the rewards will outweigh the
risks.
o Familiarity Heuristic: A mental shortcut used in various scenarios in which people presume that the
circumstances that led to previous conduct are still true in the present situation and that the previous
behavior may thus be applied correctly to the new situation. This is true when the person is under a lot
of mental strain.
o Peak-end Rule: An event's experience is rated solely on the sentiments felt at the event's apex.
Typically, not every event is viewed as complete, but rather what the spectator felt at the climax, whether
the event was pleasant or painful. All other emotions aren't lost, but they aren't used. It can also contain
the duration of the event.
o Some Other Types:
o Balance Heuristic
o Base Rate Heuristic
o Common Sense Heuristic
o Anchoring and Adjustment
o Availability Heuristic
o Contagion Heuristic
o Default Heuristic
o Educated Guess Heuristic
o Effort Heuristic
o Escalation of Commitment
o Fairness Heuristic
o Naïve Diversification
o Representativeness Heuristic
o Scarcity Heuristic
o Simulation Heuristic
o Social Proof
o Working Backward
o Formal Modes of Heuristic:
o The heuristic of Aspects Elimination
o Fast-and-frugal trees
o Fluency heuristic
o Gaze heuristic
o Recognition heuristic
o Satisficing
o Similarity heuristic
o Take-the-best heuristic

Cognitive Maps:
Cognitive maps were also discovered to be created and manipulated using heuristics. Internal representations
of our physical environment, particularly those linked with spatial relationships, are known as cognitive maps.
Our memory uses these internal representations as a guide in our external surroundings. When asked about map
imaging, distancing, and other topics, it was discovered that respondents frequently distorted the visuals; the
regularization of the images gave rise to these aberrations.
Philosophy: An excellent example is a model that is a heuristic device for comprehending what it models
because it is never identical to what it models. In this sense, heuristics include stories, analogies, and the like.
The concept of utopia, as articulated in Plato's best-known work, The Republic, is a classic example. It
implies that the "ideal city" represented in The Republic is neither offered as a goal to strive for nor as a
guiding principle for growth. Rather, it demonstrates how everything would have to be connected and how
one thing would lead to another (sometimes with disastrous consequences) if particular principles were
chosen and followed to the letter.

The noun heuristic is frequently used to define a rule-of-thumb, technique, or method. Heuristics are
important in creative thinking and the formation of scientific hypotheses, according to science philosophers.

Law: Heuristics are used in legal theory, particularly in the theory of law and economics, when a step-by-
step analysis is practicable, insofar as "practicality" is determined by the interests of a governing body.

The current securities regulatory structure is based on the assumption that all investors are completely
rational. Actual investors are constrained by cognitive biases, heuristics, and framing effects. For example,
the legal drinking age for unaccompanied persons in all states of the United States is 21 years. It is
considered that people must be mature enough to make judgments considering the risks of alcohol intake.
Given that people mature at varying rates, the age of 21 may be too late for some and too early for others.
The rather arbitrary deadline is adopted in this circumstance because it is hard or impracticable to determine
whether an individual is mature enough for society to trust them with such a high level of responsibility.
However, other proposed amendments include completing an alcohol education course on the condition for
legal alcohol possession rather than reaching 21. Because completion of such a course would probably be
optional rather than mandatory, teenage alcohol policy would be more case-by-case instead of heuristic.

Stereotyping: The heuristic method is also used by people to make opinions or judgments about things
that are not familiar to them or which they have never seen. They work as a mental shortcut to guessing
everything about a person as per his/her social status, actions, and background. It's not just related to making
assumptions about a person but also about an event, experience, and all the other things. It can be pure
guessing also. Stereotypes, as initially defined by journalist Walter Lippmann in his book Public
Opinion (1922), are mental images formed by our experiences and the information we are given about the
world.

Artificial Intelligence: The heuristic method is also helpful in AI for searching a solution space. In artificial
intelligence systems, a heuristic can be used to guide the search through a solution space. The heuristic is
obtained by modifying the weights of branches based on how likely each branch is to lead to a goal node, or
by applying a function that the designer has programmed into the system.
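
As an illustration of heuristic search over a solution space, the following Python sketch runs a greedy best-first search on an invented graph, always expanding the node whose heuristic value looks most promising:

# Greedy best-first search; the graph and h-values are made up for illustration.
import heapq

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "G"], "D": ["G"], "G": []}
h     = {"A": 3, "B": 2, "C": 1, "D": 1, "G": 0}   # estimated distance to goal G

def greedy_best_first(start, goal):
    frontier, seen = [(h[start], start, [start])], set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph[node]:
            heapq.heappush(frontier, (h[nxt], nxt, path + [nxt]))
    return None

print(greedy_best_first("A", "G"))  # ['A', 'C', 'G']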

What can manipulate decision-making?


Many different ways can be used by a person in making a decision. These different ways can be used in
different situations as per their suitability. You should understand the role of each type so that you can decide
which way you should choose in which condition. But even after the availability of these ways, a person can
get manipulated while making a decision. Such a situation can arise due to the following conditions:
o Availability: In this way, the decision is made upon the quick availability of some thoughts in mind. When we
want to make a decision, we get confused with various relevant examples. These examples are a part of your
memory. Now the question arises which example you should choose. So you can go with the example that is
more commonly or frequently get available in your mind.
For example, you want to travel from Agra to Delhi, and you choose the train as a medium of traveling. But
suddenly, you think of the number of recent train accidents. You might feel the train travel is unsafe, and you
shifted to road travel. As the thoughts that came to your mind first about train travel are dangerous, this
changes your attitude towards train traveling. Here the availability heuristic forced you to think that train
accidents are more common than they are.
o Representativeness: Under representative heuristic, the decision is made by the comparison of the current
situation with the most representative mental prototype. If you are trying to decide something, you can relate
or compare the current situation with a past situation or a mental example and can decide on the basis of it.
For example, a strange older man might remind you of your grandfather. In this situation, your mind will
compare that older man with your grandfather and immediately assume that the person will be kind, gentle,
and trustworthy because your grandfather has similar qualities.
o Affect: The affect heuristic helps in decision-making by influencing the emotions and feelings that you
experience in a particular situation. For example, as per some research, it is found that a person focuses on
the potential downsides of a decision while in a negative mood. He/she does not see the positive side and
possible benefits of the decision that can be affected by deciding with negative emotions. On the other hand,
when the person decides in a positive mood, then he/she can see all the benefits and lower the risks of the
decision. It shows how a person's mood or emotion towards a situation can affect the decision.
o Anchoring: The anchoring bias or anchoring heuristic refers to decide something by getting over-influenced
with the first bit of information that is available to you. This does not let the person consider other factors. In
this way, he/she may choose a wrong or bad decision and can't find the best possible decision. For example,
you are going shopping, and you've pre-determined how much you will pay for a dress. Now suppose, in the
first shop you get the dress at a 5% discount, and you jump over it without any bargaining or searching for a
better deal. Hence, it may be possible that your decision is not as good as it could be. This shows the anchoring
bias.

Heuristic Method of Teaching


Under this technique, an issue is presented to the students, and they are asked to solve it using multiple
literacy resources such as the library, laboratory, and workshops. The teacher's responsibility is to initiate
learning, and students participate actively throughout the process. They strive to develop relevant solutions
based on some reasoning by employing their creative thinking and imaginative abilities. They learn from
their own mistakes. The main focus of this teaching strategy is on:

o To foster a problem-solving mindset.


o To foster scientific perspectives on the issue.
o To increase one's ability to express oneself.

Its fundamental concepts are as follows:

o Tell the learner as little as possible at any given time.
o Encourage the learner to discover as much as possible on their own.

Heuristic Method vs. Exact Solution


The features that distinguish the heuristic method from the exact solution method are as follows:

o The heuristic method provides a good solution to a particular problem (without a guarantee of optimality);
the exact solution method focuses on finding the provably optimal solution.
o The heuristic method consumes less time; the exact solution method consumes more time.
o The heuristic method provides a good, immediate, short-term or approximate solution or decision; the exact
solution method provides an optimal, perfect, or rational solution or decision.
o The heuristic method is more flexible; the exact solution method is less flexible.

DESCRIPTION LOGIC IN AI

Why Description Logics?

If FOL is directly used without some kind of restriction, then:
 The structure of the knowledge/information is lost (no variables, concepts as classes, roles as
properties).
 The expressive power of FOL is too high to obtain good computational properties and
efficient reasoning procedures.
Description Logics
Description logics are a family of logics concerned with knowledge representation.
A description logic is a decidable fragment of first-order logic, associated with a set of automatic
reasoning procedures.
The basic constructs for a description logic are the notion of a concept and the notion of a
relationship.
Complex concept and relationship expressions can be constructed from atomic concepts and
relationships with suitable constructs between them.
Example :
HumanMother ⊑ Female ⊓ ∃HasChild.Person
Axioms, Disjunctions and Negations

Teaching-Assistant ⊑ ¬Undergrad ⊔ Professor

∀x. Teaching-Assistant(x) → ¬Undergrad(x) ∨ Professor(x)

A necessary condition for being a teaching assistant is to be either not an undergraduate or a
professor. Clearly, a graduate student who is a teaching assistant is not necessarily a professor;
moreover, it may be the case that some professor is not a graduate.

Teaching-Assistant ≐ ¬Undergrad ⊔ Professor

∀x. Teaching-Assistant(x) ↔ ¬Undergrad(x) ∨ Professor(x)

When the left-hand side is an atomic concept, the ⊑ symbol introduces a primitive definition
(giving only necessary conditions), while the ≐ symbol introduces a real definition, with
necessary and sufficient conditions.
In general, it is possible to have complex concept expressions on the left-hand side as well.

The most well-known description logics are:

FL⁻ - The simplest and least expressive description logic.

C, D → A | C ⊓ D | ∀R.C | ∃R

ALC - A more practical and expressive description logic.

C, D → A | ⊤ | ⊥ | ¬C | C ⊓ D | C ⊔ D | ∀R.C | ∃R.C

SHOIN - A very popular description logic; the logic underlying OWL.

DLR_idf - A very expressive description logic, capable of representing most database constructs.
Description Logic ALC (Syntax and Semantics)

Example:

Woman ⊑ Person ⊓ Female
Man ⊑ Person ⊓ ¬Female
Parent ⊑ Person ⊓ ∃hasChild.⊤
NotParent ⊑ Person ⊓ ∀hasChild.⊥

ALC is propositionally closed:
Conjunction is interpreted as intersection of sets of individuals.
Disjunction is interpreted as union of sets of individuals.
Negation is interpreted as complement of sets of individuals.
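
The set-based semantics above can be played with directly in code. In this illustrative sketch, concepts are Python sets over a small invented domain, and each ALC construct becomes a set operation:

# Concepts denote sets of individuals; roles denote sets of pairs (domain invented).
domain   = {"alice", "bob", "carol"}
Person   = {"alice", "bob", "carol"}
Female   = {"alice", "carol"}
hasChild = {("alice", "bob")}          # alice has child bob

def AND(C, D):  return C & D           # conjunction = intersection
def OR(C, D):   return C | D           # disjunction = union
def NOT(C):     return domain - C      # negation = complement
def EXISTS(R, C):                      # existential restriction, e.g. ∃hasChild.C
    return {x for x in domain if any((x, y) in R and y in C for y in domain)}

Woman  = AND(Person, Female)                    # Person ⊓ Female
Parent = AND(Person, EXISTS(hasChild, domain))  # Person ⊓ ∃hasChild.⊤
print(sorted(Woman))    # ['alice', 'carol']
print(sorted(Parent))   # ['alice']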
Conceptual Dependency (CD)

This representation is used in natural language processing in order to represent the meaning of sentences in such a
way that inferences can be made from them. It is independent of the language in which the sentences were
originally stated. CD representations of a sentence are built out of primitives, which are not words belonging to the
language but are conceptual; these primitives are combined to form the meanings of the words. As an example,
consider the event represented by the following sentence.

In the above representation the symbols have the following meanings:

Arrows indicate the direction of dependency.
The double arrow indicates a two-way link between the actor and the action.
p indicates past tense.
ATRANS is one of the primitive acts used by the theory; it indicates transfer of possession.
O indicates the object case relation.
R indicates the recipient case relation.
Conceptual dependency provides a structure in which knowledge can be represented and also a set of building
blocks from which representations can be built. A typical set of primitive actions is:
ATRANS - Transfer of an abstract relationship (e.g., give)
PTRANS - Transfer of the physical location of an object (e.g., go)
PROPEL - Application of physical force to an object (e.g., push)
MOVE - Movement of a body part by its owner (e.g., kick)
GRASP - Grasping of an object by an actor (e.g., clutch)
INGEST - Ingesting of an object by an animal (e.g., eat)
EXPEL - Expulsion of something from the body of an animal (e.g., cry)
MTRANS - Transfer of mental information (e.g., tell)
MBUILD - Building new information out of old (e.g., decide)
SPEAK - Production of sounds (e.g., say)
ATTEND - Focusing of a sense organ toward a stimulus (e.g., listen)
A second set of building blocks is the set of allowable dependencies among the conceptualizations described in a sentence.
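
A CD structure can be captured with a small record type. The sketch below encodes "John gave Mary a book" as an ATRANS act; the field names are our own approximation of the case relations listed above:

# Hypothetical encoding of a CD conceptualization (field names are illustrative).
from dataclasses import dataclass

@dataclass
class CD:
    primitive: str    # e.g. ATRANS, PTRANS, MTRANS
    actor: str        # the two-way actor<->act link
    obj: str          # O: object case relation
    source: str       # R: recipient case relation (from)
    recipient: str    # R: recipient case relation (to)
    tense: str = "p"  # p: past tense

gave = CD(primitive="ATRANS", actor="John", obj="book",
          source="John", recipient="Mary")
print(gave)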
Artificial Neural Network

The term "artificial neural network" refers to a biologically inspired sub-field of artificial intelligence modeled after
the brain. This section covers the main aspects of ANNs, including adaptive resonance theory, the Kohonen
self-organizing map, building blocks, unsupervised learning, and genetic algorithms.

What is Artificial Neural Network?


The term "Artificial Neural Network" is derived from biological neural networks, which develop the structure of the
human brain. Just as the human brain has neurons interconnected with one another, artificial neural networks also
have neurons that are interconnected with one another in various layers of the network. These neurons are known as
nodes.

The given figure illustrates the typical diagram of Biological Neural Network.

The typical Artificial Neural Network looks something like the given figure.

Dendrites from Biological Neural Network represent inputs in Artificial Neural Networks, cell nucleus represents
Nodes, synapse represents Weights, and Axon represents Output.
Relationship between Biological neural network and artificial neural network:

Biological Neural Network Artificial Neural Network

Dendrites Inputs

Cell nucleus Nodes

Synapse Weights

Axon Output

An artificial neural network, in the field of artificial intelligence, attempts to mimic the network of neurons that
makes up the human brain, so that computers can understand things and make decisions in a human-like manner.
The artificial neural network is designed by programming computers to behave simply like interconnected
brain cells.

The human brain contains on the order of 100 billion neurons, each with association points numbering somewhere
between 1,000 and 100,000. In the human brain, data is stored in a distributed manner, and we can extract more than
one piece of this data from memory in parallel when necessary. We can say that the human brain is made up of
incredibly amazing parallel processors.

We can understand the artificial neural network with an example. Consider a digital logic gate that takes an input and
gives an output, such as an "OR" gate with two inputs: if one or both inputs are "On," the output is "On"; if both
inputs are "Off," the output is "Off." Here the output depends directly on the input. Our brain does not work this way:
the output-to-input relationship keeps changing because the neurons in our brain are "learning."

The architecture of an artificial neural network:


To understand the architecture of an artificial neural network, we have to understand what a neural network consists
of: a large number of artificial neurons, termed units, arranged in a sequence of layers. Let us look at the various
types of layers available in an artificial neural network.

Artificial Neural Network primarily consists of three layers:


Input Layer:

As the name suggests, it accepts inputs in several different formats provided by the programmer.

Hidden Layer:

The hidden layer lies between the input and output layers. It performs all the calculations to find hidden features
and patterns.

Output Layer:

The input goes through a series of transformations using the hidden layer, which finally results in output that is
conveyed using this layer.

The artificial neural network takes input, computes the weighted sum of the inputs, and includes a bias. This
computation is represented in the form of a transfer function.

The weighted total is then passed as input to an activation function to produce the output. Activation functions
decide whether a node should fire or not; only the nodes that fire make it to the output layer. There are distinctive
activation functions available that can be applied depending on the sort of task being performed.
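
A minimal sketch of that computation: a weighted sum plus bias passed through a threshold activation. Wiring it as an OR gate echoes the logic-gate analogy earlier in this section (the weights and bias are our choice):

# One artificial neuron: weighted sum + bias, then a step (threshold) activation.
def neuron(inputs, weights, bias):
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0      # "fire" only above the threshold

# Behaves as an OR gate with these weights and bias:
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", neuron(x, weights=(1, 1), bias=-0.5))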

Advantages of Artificial Neural Network (ANN)


Parallel processing capability:

Artificial neural networks can perform more than one task simultaneously.

Storing data on the entire network:

Unlike in traditional programming, data is stored on the whole network rather than in a database. The disappearance
of a couple of pieces of data in one place doesn't prevent the network from working.

Capability to work with incomplete knowledge:

After training, an ANN may produce output even with incomplete data. The loss of performance here depends on
the significance of the missing data.

Having a memory distribution:

For an ANN to be able to adapt, it is important to determine suitable examples and to train the network by
demonstrating these examples to it according to the desired output. The success of the network is directly
proportional to the chosen instances; if the situation cannot be shown to the network in all its aspects, it can
produce false output.

Having fault tolerance:

Corruption of one or more cells of an ANN does not prevent it from generating output, and this feature makes the
network fault-tolerant.

Disadvantages of Artificial Neural Network:


Assurance of proper network structure:

There is no particular guideline for determining the structure of artificial neural networks. The appropriate network
structure is accomplished through experience, trial, and error.

Unrecognized behavior of the network:

This is the most significant issue with ANNs. When an ANN produces a solution, it does not provide insight
concerning why and how, which decreases trust in the network.

Hardware dependence:

Artificial neural networks need processors with parallel processing power, in accordance with their structure. The
realization of the equipment is therefore hardware-dependent.

Difficulty of showing the issue to the network:

ANNs can work only with numerical data. Problems must be converted into numerical values before being
introduced to the ANN. The presentation mechanism chosen here directly impacts the performance of the network,
and it relies on the user's abilities.

The duration of the network is unknown:

Training reduces the network's error to some specific value, but this value does not necessarily give us optimal results.

How do artificial neural networks work?


An artificial neural network can best be represented as a weighted directed graph, where the artificial
neurons form the nodes. The associations between neuron outputs and neuron inputs can be viewed
as directed edges with weights. The artificial neural network receives the input signal from an
external source in the form of a pattern or an image, represented as a vector. These inputs are then
mathematically denoted by the notation x(n) for each of the n inputs.
Afterward, each input is multiplied by its corresponding weight (these weights are the details utilized by the
artificial neural network to solve a specific problem). In general terms, these weights represent the strength
of the interconnections between neurons inside the network. All the weighted inputs are summed
inside the computing unit.

If the weighted sum is zero, a bias is added to make the output non-zero, or to otherwise scale up the
system's response. The bias has a fixed input of 1 and its own weight. The total of the weighted inputs can lie in the
range 0 to positive infinity. To keep the response within the limits of the desired value, a certain maximum value is
benchmarked, and the total of the weighted inputs is passed through an activation function.

The activation function refers to the set of transfer functions used to achieve the desired output. There are different
kinds of activation functions, primarily either linear or non-linear sets of functions. Some of the commonly used
activation functions are the binary, linear, and tan hyperbolic sigmoidal activation functions. Let us take a look
at two of them in detail:

Binary:
In a binary activation function, the output is either a one or a 0. To accomplish this, a threshold value is set
up. If the net weighted input of the neuron is more than 1, the final output of the activation function is returned as
one; otherwise the output is returned as 0.

Sigmoidal Hyperbolic:
The sigmoidal hyperbola function is generally seen as an "S"-shaped curve. Here the tan hyperbolic function is used
to approximate the output from the actual net input. The function is defined as:

F(x) = 1 / (1 + exp(-βx))

where β is considered the steepness parameter.
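
Both activation functions as code (the threshold and the default steepness are illustrative choices):

# The binary (threshold) and sigmoidal activation functions described above.
import math

def binary(x, threshold=1.0):
    return 1 if x > threshold else 0

def sigmoid(x, beta=1.0):
    return 1.0 / (1.0 + math.exp(-beta * x))

print(binary(1.5), sigmoid(0.0), round(sigmoid(2.0), 3))  # 1 0.5 0.881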


Types of Artificial Neural Network:
There are various types of artificial neural networks (ANN), which, depending on the human brain's neuron and
network functions, perform tasks in a similar way. The majority of artificial neural networks have some similarities
with their more complex biological counterparts and are very effective at their intended tasks, for example,
segmentation or classification.

Feedback ANN:
In this type of ANN, the output returns into the network to achieve the best-evolved results internally. As per
the University of Massachusetts Lowell Centre for Atmospheric Research, feedback networks feed information
back into themselves and are well suited to solving optimization problems. Internal system error corrections
utilize feedback ANNs.

Feed-Forward ANN:

A feed-forward network is a basic neural network comprising an input layer, an output layer, and at least one layer
of neurons. Through assessment of its output by reviewing its input, the strength of the network can be judged based
on the group behavior of the associated neurons, and the output is decided. The primary advantage of this network is
that it figures out how to evaluate and recognize input patterns.
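
A tiny illustrative forward pass through such a network in NumPy; the layer sizes and random weights are arbitrary, and no training is shown:

# One hidden layer feed-forward pass: input -> tanh hidden layer -> sigmoid output.
import numpy as np

rng = np.random.default_rng(0)
x  = rng.normal(size=3)                          # input vector
W1 = rng.normal(size=(4, 3)); b1 = np.zeros(4)   # input -> hidden weights
W2 = rng.normal(size=(2, 4)); b2 = np.zeros(2)   # hidden -> output weights

hidden = np.tanh(W1 @ x + b1)                    # hidden-layer activations
output = 1 / (1 + np.exp(-(W2 @ hidden + b2)))   # sigmoid output layer
print(output.shape, output)                      # (2,) and two values in (0, 1)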

LONG SHORT-TERM MEMORY (LSTM):

What are LSTM Networks

Long Short-Term Memory (LSTM) is a sophisticated version of the recurrent neural network (RNN) design, created
to model chronological sequences and their long-range dependencies more precisely than traditional RNNs. Its main
features are the internal design of an LSTM cell, the various modifications introduced into the LSTM structure, and
a few applications of LSTMs that are in high demand. This section also touches on the comparison of LSTMs with
GRUs, and ends with a list of the drawbacks of the LSTM network and a note on the new attention-based models
that are swiftly replacing LSTMs in the real world.

Introduction:

LSTM networks extend recurrent neural networks (RNNs), and were mainly designed to handle situations in which
RNNs do not work. An RNN is an algorithm that processes the current input by taking into account the output of
previous events (feedback), storing it in its internal memory for a brief amount of time (short-term memory). Of its
many applications, the most well-known ones are in the areas of non-Markovian speech control and music
composition. However, RNNs have drawbacks. First, they fail to store information over long periods of time:
sometimes a piece of data stored a considerable time ago is needed to determine the current output, and RNNs are
incapable of managing such "long-term dependencies." Second, there is no fine control over which part of the
context needs to be carried forward and which part of the past should be forgotten. Other issues with RNNs are the
exploding or vanishing gradients (explained later) that occur when training an RNN through backpropagation.
Therefore, Long Short-Term Memory (LSTM) was brought into the picture. It was designed so that the
vanishing-gradient problem is almost entirely eliminated while the training model is unaffected. LSTMs solve
long-time-lag problems and also deal with the effects of noise, distributed representations, and continuous values.
With LSTMs, there is no need to maintain a finite number of states in advance, as required by the hidden Markov
model (HMM). LSTMs provide a wide range of parameters such as learning rates and input and output biases, so
no fine adjustments are needed. The effort to update each weight is reduced to O(1) with LSTMs, similar to Back
Propagation Through Time (BPTT), which is a significant advantage.

Exploding and Vanishing Gradients:

In training a network, the primary objective is to reduce the loss (in terms of cost or error) seen in the network's
output when training data is passed through it. We determine the gradient of the loss with respect to a set of weights,
adjust the weights accordingly, and repeat this process until we arrive at an optimal set of weights for which the loss
is as low as possible. This is the idea behind backpropagation. Sometimes the gradient becomes vanishingly small.
The amount of gradient in one layer depends on components from the following layers; if any component is tiny
(less than one), the resulting gradient will be even smaller. This is known as the scaling effect. When this effect is
multiplied by the learning rate, itself a small value between 0.1 and 0.001, it produces an even smaller value; the
change in weights is then minimal and produces nearly the same output as before. Conversely, if the gradients are
large because of large components, the weights get updated beyond their ideal values; this is commonly referred to
as the exploding-gradient problem. To stop this scaling effect, the neural network unit was rebuilt so that the scaling
factor was fixed to one. The cell was then enriched with a number of gating units and was called the LSTM.
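
A quick numerical illustration of the scaling effect (the per-layer factors 0.9 and 1.1 are made up):

# A gradient that is a product of many per-layer factors shrinks when factors are
# below one (vanishing) and blows up when they are above one (exploding).
factors_small, factors_large = [0.9] * 50, [1.1] * 50
grad_small = grad_large = 1.0
for s, l in zip(factors_small, factors_large):
    grad_small *= s
    grad_large *= l
print(grad_small)  # ~0.005 -> vanishing gradient
print(grad_large)  # ~117   -> exploding gradient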

Architecture:

The main difference between the structures of RNNs and LSTMs is that the hidden layer of an LSTM is a gated unit
or cell. It has four layers that work with each other to produce the cell's output along with the cell's state; both of
these are passed to the next layer. Contrary to RNNs, which have a single neural net layer made up of tanh, LSTMs
comprise three logistic sigmoid gates and one tanh layer. Gates were added to limit the information that passes
through the cell. They decide which portion of the data is needed by the next cell and which parts are to be
discarded. The output typically falls in the range 0-1, where "0" means "reject all" and "1" means "include all."

Hidden layers of LSTM:

Each LSTM cell has three inputs (xt, ht-1, and Ct-1) and two outputs (ht and Ct). At a specific time t, ht is the hidden
state, Ct is the cell state or memory, and xt is the current data point or input. The first sigmoid layer takes two
inputs, ht-1 and xt, where ht-1 is the hidden state of the previous cell. It is known as the forget gate, since its output
selects how much of the previous cell's information should be kept. Its output is a number in [0, 1] that is
multiplied (pointwise) with the previous cell state Ct-1.
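
The gating behaviour described above can be summarized in a small NumPy sketch. This is an illustrative,
from-scratch implementation of a single LSTM step with assumed toy dimensions, not code from any particular
library:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x_t, h_prev, c_prev, W, U, b):
        # Four layers: forget gate f, input gate i, candidate g, output gate o.
        f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])  # how much of C(t-1) to keep
        i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])  # how much new data to write
        g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])  # candidate values
        o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])  # how much state to expose
        c_t = f * c_prev + i * g          # new cell state (pointwise products)
        h_t = o * np.tanh(c_t)            # new hidden state
        return h_t, c_t

    # Toy sizes: 4-dimensional input, 3-dimensional hidden/cell state.
    rng = np.random.default_rng(0)
    n_in, n_hid = 4, 3
    W = {k: rng.normal(size=(n_hid, n_in)) for k in "figo"}
    U = {k: rng.normal(size=(n_hid, n_hid)) for k in "figo"}
    b = {k: np.zeros(n_hid) for k in "figo"}
    h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, U, b)

Note how the forget gate's sigmoid output in [0, 1] multiplies the previous cell state pointwise, exactly as
described above.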
Applications:

LSTM models have to be trained on a training dataset before they can be used in real-world applications. Some
of the most demanding applications are listed below:

1. Text generation, or language modelling, involves predicting the next word whenever a sequence of words is
supplied as input. Language models can work at the character or n-gram level as well as at the sentence or
paragraph level (a minimal model sketch follows this list).
2. Image captioning involves analyzing a photograph and expressing its content as a sentence. Doing this
requires a dataset of many photos paired with descriptive captions. A trained model extracts the features of
the images in the photo dataset, while the caption text is processed to keep only the most suggestive words.
Combining these two kinds of information, the model's job is to produce a descriptive phrase for the image,
one word at a time, using the image together with the words it has already predicted.
3. Speech and Handwriting Recognition
4. Music generation is identical in spirit to text generation: an LSTM predicts musical notes instead of text by
studying a mix of notes fed into the input.
5. Language translation involves mapping a sequence in one language to the corresponding sequence in a
different language. As with image captioning, the dataset, here consisting of sentences and their translations,
is cleaned first, and only the relevant portion is used to build the model. An encoder-decoder LSTM model
converts the input sequence into a fixed-length vector (encoding) and then decodes it into the translated
version.
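
As a concrete illustration of the first application, here is a hedged sketch of a tiny character-level language
model using Keras (assuming TensorFlow 2.x is installed; the toy corpus, layer sizes, and training length are
illustrative placeholders, not values from the source):

    import numpy as np
    import tensorflow as tf

    text = "the bird pecks the grains "          # toy corpus
    chars = sorted(set(text))
    idx = {ch: i for i, ch in enumerate(chars)}
    seq_len = 5

    # Build (input sequence, next character) training pairs.
    X = np.array([[idx[c] for c in text[i:i + seq_len]]
                  for i in range(len(text) - seq_len)])
    y = np.array([idx[text[i + seq_len]] for i in range(len(text) - seq_len)])

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(len(chars), 8),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(len(chars), activation="softmax"),
    ])
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
    model.fit(X, y, epochs=50, verbose=0)

    # Predict the character most likely to follow a seed sequence.
    seed = np.array([[idx[c] for c in "the b"]])
    print(chars[int(model.predict(seed, verbose=0).argmax())])

The same pattern scales up to word-level models; for translation (item 5), a second LSTM acting as a decoder
would be added on top of the encoder.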

Drawbacks:

Everything in the world has its advantages and disadvantages, and LSTMs are no exception. A few of their
disadvantages are discussed below:

1. They became popular because they mitigated the problem of vanishing gradients; however, they do not
eliminate it entirely. Data still has to be moved from cell to cell for evaluation, and the cell has become quite
complex with the additional functions (such as the forget gate) that are now part of the picture.
2. They require a lot of time and resources to be trained and made ready for real-world applications. Technically
speaking, they need high memory bandwidth because of the linear layers present within each cell, which the
system usually cannot supply. In terms of hardware, LSTMs are therefore rather inefficient.
3. With data volumes growing, researchers are searching for models that can retain past information for longer
periods than LSTMs can. The motivation behind the development of such models is the human habit of
dividing a chunk of information into smaller parts to make it easy to remember.
4. LSTMs are affected by random weight initialization and in this respect behave similarly to feed-forward
neural networks. They favour small weight initializations over large ones.
AI - Natural Language Processing
Natural Language Processing (NLP) refers to the AI methods for communicating with intelligent systems using
a natural language such as English.
Processing of natural language is required when you want an intelligent system such as a robot to perform as
per your instructions, when you want to hear a decision from a dialogue-based clinical expert system, and so on.
The field of NLP involves making computers perform useful tasks with the natural languages humans use.
The input and output of an NLP system can be −

 Speech
 Written Text
Components of NLP
There are two components of NLP, as given below −
Natural Language Understanding (NLU)
Understanding involves the following tasks −

 Mapping the given input in natural language into useful representations.
 Analyzing different aspects of the language.
Natural Language Generation (NLG)
It is the process of producing meaningful phrases and sentences in the form of natural language from some
internal representation.
It involves −
 Text planning − It includes retrieving the relevant content from the knowledge base.
 Sentence planning − It includes choosing the required words, forming meaningful phrases, and setting the
tone of the sentence.
 Text Realization − It is mapping the sentence plan into sentence structure.
NLU is harder than NLG.

Difficulties in NLU
NL has an extremely rich form and structure.
It is also very ambiguous. There can be different levels of ambiguity −
 Lexical ambiguity − Ambiguity at a very primitive level, such as the word level. For example, should the
word “board” be treated as a noun or a verb?
 Syntax-level ambiguity − A sentence can be parsed in different ways. For example, “He lifted the beetle
with red cap.” − Did he use the cap to lift the beetle, or did he lift a beetle that had a red cap?
 Referential ambiguity − Referring to something using pronouns. For example, Rima went to Gauri. She
said, “I am tired.” − Exactly who is tired?
In addition, one input can have several meanings, and many inputs can mean the same thing.
NLP Terminology
 Phonology − It is the study of organizing sound systematically.
 Morphology − It is the study of the construction of words from primitive meaningful units.
 Morpheme − It is the primitive unit of meaning in a language.
 Syntax − It refers to arranging words to make a sentence. It also involves determining the structural
role of words in the sentence and in phrases.
 Semantics − It is concerned with the meaning of words and how to combine words into meaningful
phrases and sentences.
 Pragmatics − It deals with using and understanding sentences in different situations and how the
interpretation of the sentence is affected.
 Discourse − It deals with how the immediately preceding sentence can affect the interpretation of the
next sentence.
 World Knowledge − It includes general knowledge about the world.
Steps in NLP
There are five general steps −
 Lexical Analysis − It involves identifying and analyzing the structure of words. The lexicon of a language
means the collection of words and phrases in that language. Lexical analysis divides the whole chunk
of text into paragraphs, sentences, and words.
 Syntactic Analysis (Parsing) − It involves analyzing the words in the sentence for grammar and
arranging the words in a manner that shows the relationships among them. A sentence such as
“The school goes to boy” is rejected by an English syntactic analyzer. (A small sketch after this list
illustrates these first two steps.)
 Semantic Analysis − It draws the exact meaning, or the dictionary meaning, from the text. The text is
checked for meaningfulness by mapping syntactic structures onto objects in the task domain.
The semantic analyzer disregards sentences such as “hot ice-cream”.
 Discourse Integration − The meaning of any sentence depends upon the meaning of the sentence
just before it. In addition, it also shapes the meaning of the immediately succeeding sentence.
 Pragmatic Analysis − During this step, what was said is reinterpreted in terms of what was actually
meant. It involves deriving those aspects of language which require real-world knowledge.
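The first two steps can be tried directly with the NLTK library. The calls below (sent_tokenize, word_tokenize,
pos_tag) are standard NLTK functions; the sample sentence is illustrative, and the commented download lines
fetch the required data once:

    import nltk
    # One-time data fetch:
    # nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

    text = "The bird pecks the grains. It is hungry."

    for sent in nltk.sent_tokenize(text):      # lexical analysis: sentences
        words = nltk.word_tokenize(sent)       # lexical analysis: words
        print(nltk.pos_tag(words))             # part-of-speech tags feed syntactic analysis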
Implementation Aspects of Syntactic Analysis
There are a number of algorithms researchers have developed for syntactic analysis, but we consider only
the following simple methods −

 Context-Free Grammar
 Top-Down Parser
Let us see them in detail −

Context-Free Grammar
It is a grammar whose rewrite rules have a single symbol on the left-hand side. Let us
create a grammar to parse the sentence −
“The bird pecks the grains”
Articles (DET) − a | an | the
Nouns − bird | birds | grain | grains
Noun Phrase (NP) − Article + Noun | Article + Adjective + Noun
= DET N | DET ADJ N
Verbs − pecks | pecking | pecked
Verb Phrase (VP) − Verb + Noun Phrase = V NP
Adjectives (ADJ) − beautiful | small | chirping
The parse tree breaks down the sentence into structured parts so that the computer can easily understand
and process it. In order for the parsing algorithm to construct this parse tree, a set of rewrite rules, which
describe what tree structures are legal, needs to be constructed.
These rules say that a certain symbol may be expanded in the tree by a sequence of other symbols.
According to the first rewrite rule, if there are two strings, a Noun Phrase (NP) and a Verb Phrase (VP), then
the string formed by NP followed by VP is a sentence. The rewrite rules for the sentence are as follows −
S → NP VP
NP → DET N | DET ADJ N
VP → V NP
Lexicon −
DET → a | the
ADJ → beautiful | perching
N → bird | birds | grain | grains
V → peck | pecks | pecking
The parse tree for this sentence can then be created from these rules.
Now consider the above rewrite rules. Since V can be replaced by both "peck" and "pecks", sentences such
as "The bird peck the grains" are wrongly permitted, i.e., the subject-verb agreement error is accepted
as correct.
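The grammar above can be tried out with NLTK, which provides a CFG class and a chart parser (standard NLTK
APIs; the code itself is an illustrative sketch). Note how the flat lexicon also accepts the agreement error just
described:

    import nltk

    grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> DET N | DET ADJ N
    VP -> V NP
    DET -> 'a' | 'the'
    ADJ -> 'beautiful' | 'perching'
    N -> 'bird' | 'birds' | 'grain' | 'grains'
    V -> 'peck' | 'pecks' | 'pecking'
    """)
    parser = nltk.ChartParser(grammar)

    for tree in parser.parse("the bird pecks the grains".split()):
        tree.pretty_print()            # draws the parse tree as text

    # The subject-verb agreement error is also parsed successfully:
    trees = list(parser.parse("the bird peck the grains".split()))
    print(len(trees), "parse(s) found for the ungrammatical sentence")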
Merit − It is the simplest style of grammar and is therefore the most widely used one.
Demerits −
 They are not highly precise. For example, “The grains peck the bird” is syntactically correct
according to the parser, but even though it makes no sense, the parser takes it as a correct sentence.
 To achieve high precision, multiple sets of grammar need to be prepared. Completely different sets
of rules may be required for parsing singular and plural variations, passive sentences, etc., which can
lead to the creation of a huge, unmanageable set of rules.
Top-Down Parser
Here, the parser starts with the S symbol and attempts to rewrite it into a sequence of terminal symbols that
matches the classes of the words in the input sentence, until it consists entirely of terminal symbols.
These are then checked against the input sentence to see whether they match. If not, the process starts over
with a different set of rules. This is repeated until a derivation is found that describes the structure of the
sentence.
Merit − It is simple to implement.
Demerits −

 It is inefficient, as the search process has to be repeated if an error occurs.
 It works slowly.
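
NLTK also ships a textbook top-down parser, RecursiveDescentParser, which starts from S and expands rules
until the leaves match the input words. The sketch below reuses the grammar defined in the previous sketch
(again an illustrative example, not the only possible implementation):

    import nltk

    grammar = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> DET N | DET ADJ N
    VP -> V NP
    DET -> 'a' | 'the'
    ADJ -> 'beautiful' | 'perching'
    N -> 'bird' | 'birds' | 'grain' | 'grains'
    V -> 'peck' | 'pecks' | 'pecking'
    """)

    # Expands S top-down and backtracks whenever an expansion fails to match.
    parser = nltk.RecursiveDescentParser(grammar)
    for tree in parser.parse("the bird pecks the grains".split()):
        print(tree)

Because it backtracks whenever an expansion fails to match the input, this parser illustrates the inefficiency
noted above; chart parsers avoid such repeated work.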
