Slide-1 Hello Everyone!! Welcome You All at Centre For Future Technologies in This Session We Will Discuss About The Word Level Analysis

You might also like

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 9

Slide-1

Hello everyone!! Welcome you all at centre for future


technologies
In this session we will discuss about the
Word Level Analysis

Slide-2
Regular Expressions
A regular expression (RE) is a language for specifying text search strings. RE helps us to
match or find other strings or sets of strings, using a specialized syntax held in a pattern.
Regular expressions are used to search texts in UNIX as well as in MS WORD in identical
way. We have various search engines using a number of RE features.

Slide-3
Properties of Regular Expression (Contd….)
 American Mathematician Stephen Cole Kleene formalized the Regular
Expression language.
 RE is a formula in a special language, which can be used for specifying
simple classes of strings, a sequence of symbols. In other words, we can say
that RE is an algebraic notation for characterizing a set of strings.
 Regular expression requires two things, one is the pattern that we wish to
search and other is a corpus of text from which we need to search.

Slide-4
• Examples of Regular Expressions
Regular Regular Set
Expressions

(0 + 10*) L = { 0, 1, 10, 100, 1000, 10000, … }

(0*10*) L = {1, 01, 10, 010, 0010, …}

(0 + ε)(1 + ε) L = {ε, 0, 1, 01}

Set of strings of a's and b's of any length including the null string. So L = { ε, a,
(a+b)*
b, aa , ab , bb , ba, aaa…….}
Slide-5
Characteristics of Regular Sets
 The union of two regular sets then the resulting set would also
be regular.
 The intersection of two regular sets then the resulting set would
also be regular
 The complement of regular sets, then the resulting set would
also be regular.

Slide-6
Characteristics of Regular Sets (Contd….)
 The difference of two regular sets, then the resulting set would
also be regular.
 The reversal of regular sets, then the resulting set would also be
regular.
 The closure of regular sets, then the resulting set would also be
regular.

The concatenation of two regular sets, then the resulting set


would also be regular.

Slide-7
What is Finite State Automata
Automata mean “self-regulating mechanism", which may be defined
as an conceptual self-drive computing device that follows a
preprogrammed sequence of steps automatically.
Finite State automata have a finite number of states that is why it is
called Finite State Automata.
Scientifically, an automaton can be represented by a 5-tuple (Q, Σ,
δ, q0, F.Q means finite set of states. Σ is a finite set of symbols,
called the alphabet of the automaton. δ means transition function
q0 means the initial state from where any input is processed (q0 ∈
Q). F means a set of final states of Q (F ⊆ Q).

Slide-8
Regular Expressions Regular Grammars and Finite
Automata
Finite state automata are the abstract foundation of computational
work and regular expressions is a way of relating them.
Regular expression is a way to describe a kind of language called
Regular language.

Regular language can be described with the help of Finite State


Automata and regular expression.
Regular grammar is a formal grammar that can be right regular or left
regular.
Regular expression can be executed by Finite State Automata and
Finite State Automata can be described with a Regular Expression.

Slide-9
Regular Expressions Regular Grammars and Finite
Automata represented here in an image form
Slide-10
Types of Finite State Automation
Deterministic Finite automation
 In finite automation for every input symbol we can determine
the state to which the machine will move. It has a finite number
of states that is why we call this as Deterministic Finite
Automaton.

Slide-11
Types of Finite State Automation (Contd….)
 Deterministic Finite Automaton represented by a 5-Parameters
(Q, Σ, δ, Q0, F), where −
 Q is a finite set of states.
 Σ is a finite set of symbols, called the alphabet of the
automaton.
 δ is the transition function where δ: Q × Σ → Q .
 Q0 is the initial state from where any input is processed (Q0
∈ Q).
 F is a set of final states of Q (F ⊆ Q).

Slide-12
Types of Finite State Automation (Contd….)
 Deterministic Finite Automaton represented by a 5-Parameters
(Q, Σ, δ, Q0, F), where −
 Q is a finite set of states.
 Σ is a finite set of symbols, called the alphabet of the
automaton.
 δ is the transition function where δ: Q × Σ → Q .
 Q0 is the initial state from where any input is processed (Q0
∈ Q).
 F is a set of final states of Q (F ⊆ Q).

Slide-13
Types of Finite State Automation (Contd….)
 Deterministic Finite Automaton can be represented by diagraphs
called state diagrams where −
 The states are represented by vertices.
 The transitions are shown by labeled arcs.
 The initial state is represented by an empty incoming arc.
 The final state is represented by double circle.

Slide-14
Types of Finite State Automation (Contd….)
Here we have shown the finite state automation with this
table in which we have the different reaching state on the
different input.

Slide-15
Types of Finite State Automation (Contd….)
• The graphical representation of this Deterministic Finite
Automaton would be as follows −
Slide-16
Types of Finite State Automation (Contd….)
 Non-deterministic Finite Automation

 Finite automation where for every input symbol we cannot


determine the state to which the machine will move the
machine can move to any combination of the states.
 It has a finite number of states that is why the machine is
called Non-Deterministic Finite Automation.
Slide-17
Types of Finite State Automation (Contd….)
• Non-deterministic Finite Automation can be represented by a 5-
tuple Q, Σ, δ, Q0, F
Q is a finite set of states.
Σ is a finite set of symbols, called the alphabet of the automaton.
δ is the transition function where δ: Q × Σ → 2Q.
Q0 is the initial state from where any input is processed (Q0 ∈
Q).
F is a set of final state/states of Q (F ⊆ Q).

Slide-18
Types of Finite State Automation (Contd….)
Non-deterministic Finite Automation can be represented by diagraphs
called state diagrams
 The states are represented by vertices.
 The transitions are reflected by labeled arcs.
 The initial state is represented by an empty incoming arc
Slide-19
Types of Finite State Automation (Contd….)
Example
Suppose a NDFA be
Q = {a, b, c},
Σ = {0, 1},
q0 = {a},
F = {c},

Slide-20
Types of Finite State Automation (Contd….)
The graphical representation of this would be as follows

Slide-21
Morphological Parsing
Morphological Parsing involves the recognizing of the problem.
In this a word breakdowns into smaller meaningful pieces called
morphemes generating some sort of linguistic structure for it, like we
can break the word boxes into two, box and es.

Slide-22
Morphological Parsing (Contd….)
 Morphology is the study of :
 The formation of words.
 Use of prefixes and suffixes in the formation of words.
 The origin of the words.
 How parts-of-speech (PoS) of a language are formed.
 Grammatical forms of the words.
 Morphemes, the smallest meaning-bearing units, can be divided
into two types −
1. Stems
2. Word Order

Slide-23
Morphological Parsing (Contd….)

Stems
Stem is the meaningful unit of a word. Stem is the root of the word
like, in the word boxes, the stem is box.

Slide-24
Morphological Parsing (Contd….)
• Affixes − As the name suggests, they add some additional
meaning and grammatical functions to the words. For example,
in the word boxes, the affix is es.

Slide-25
Morphological Parsing (Contd….)
 Affixes can also be divided into four types −
 Prefixes 
 Suffixes
 Infixes
 Circumfixes 

Slide-26
Morphological Parsing (Contd….)
Word Order
Order of the words would be decided by morphological parsing.
Lexicon
The very first requirement for building a morphological parser is
lexicon, which includes the list of stems and affixes along with the
basic information about them, the information like whether the stem is
Noun stem or Verb stem, etc.

Slide-27
Morphological Parsing (Contd….)
Morphotactics
This model explains which classes of morphemes can follow other
classes of morphemes inside a word. English plural morpheme always
follows the noun rather than preceding it.
Orthographic rules
Orthographic rules are used to model the changes occurring in a word
like rule of converting y to ie in word like pasty+s = pasties not
pastys.
Slide-28
Thank you

You might also like