What It Is


What it is?

The Pumping Lemma is a property of all regular languages.

How is it used?

It is a technique used to show that a given language is not regular.

Let L be a regular language. (This property should hold for all regular languages.)
Then there exists some constant N such that for every string w ∈ L s.t. |w| ≥ N, there exists a way to break w into three parts, w = xyz, such that:
1. y ≠ ε
2. |xy| ≤ N
3. For all k ≥ 0, all strings of the form xy^k z ∈ L

PROOF
1. L is regular => it should have a DFA.
2. Set N := number of states in the DFA.
3. Any string w ∈ L, s.t. |w| ≥ N, should have the form w = a_1 a_2 … a_m, where m ≥ N.
4. Let the states traversed after reading the first N symbols be {p_0, p_1, …, p_N}.
5. ==> There are N+1 p-states, while there are only N DFA states.
6. ==> At least one state has to repeat, i.e., p_i = p_j where 0 ≤ i < j ≤ N (by the pigeonhole principle, PHP).
7. We should be able to break w = xyz as follows:
   • x = a_1 a_2 … a_i; y = a_(i+1) a_(i+2) … a_j; z = a_(j+1) a_(j+2) … a_m
   • x's path will be p_0 … p_i
   • y's path will be p_i p_(i+1) … p_j (but p_i = p_j, implying a loop)
   • z's path will be p_j p_(j+1) … p_m
8. Now consider another string w_k = xy^k z, where k ≥ 0.
   • Case k = 0: the DFA skips the loop and will reach the accept state p_m.
   • Case k > 0: the DFA will loop for y^k, and finally reach the accept state p_m for z.
   • In either case, w_k ∈ L. This proves part (3).
9. For part (1): since i < j, y ≠ ε.
10. For part (2): by PHP, the repetition of states has to occur within the first N symbols in w ==> |xy| ≤ N.

The pumping lemma for context-free languages tells us that if there is a string long enough to cause a cycle (the same variable appears more than once in the derivation), then we can always find two pieces of this sufficiently long string to "pump" in tandem and discover an infinite sequence of strings that have to be in the language.
That is: if we repeat each of the two pieces the same number of times, we get another string in the language.

Let L be a CFL.
Then there exists a constant N, s.t., if z ∈ L s.t. |z| ≥ N, then we can write z = uvwxy, such that:
1. |vwx| ≤ N
2. vx ≠ ε
3. For all k ≥ 0: uv^k w x^k y ∈ L

PROOF
If L = ∅ or contains only ε, then the lemma is trivially satisfied (as it cannot be violated).
• For any other L which is a CFL:
• Let G be a CNF grammar for L
• Let m = number of variables in G
• Choose N = 2^m.
• Pick any z ∈ L s.t. |z| ≥ N
• The parse tree for z should have a height ≥ m+1 (by the parse tree theorem)
• A longest path in that tree therefore passes through at least m+1 variables, so by PHP some variable repeats on it; the two occurrences of that variable mark out the pieces v and x that can be pumped in tandem.

1-REGULAR EXPRESSIONS
Base case:
o The empty set ∅ and the empty string ε are regular expressions.
o For each symbol a in Σ, a is a regular expression denoting the singleton set {a}.
Inductive steps:
o If r and s are regular expressions, then the following are also regular expressions:
§ Concatenation: rs (the concatenation of r and s).
§ Union: r | s (the union of r and s).
§ Closure: r* (the Kleene closure of r)

2-ε-NFA
An ε-NFA (epsilon-Nondeterministic Finite Automaton) is a type of nondeterministic finite automaton (NFA) extended with ε-transitions, or epsilon-transitions. These transitions allow the automaton to move from one state to another without consuming any input symbol.
Formally, an ε-NFA is a 5-tuple (Q, Σ, δ, q0, F), where:
• Q is a finite set of states.
• Σ is a finite set of input symbols (the alphabet).
• δ is the transition function, which maps Q × (Σ ∪ {ε}) to 2^Q (the power set of Q), representing the set of states the machine can move to from a given state with a given input symbol or ε.
• q0 is the initial state.
• F is a set of final (or accepting) states.
ε-transitions make automata easier to build, for example when combining machines for union, concatenation, or closure. They do not add expressive power: every ε-NFA can be converted to an equivalent NFA or DFA, so ε-NFAs recognize exactly the regular languages.

3-NFA
A Nondeterministic Finite Automaton (NFA) is a mathematical model used in computer science and formal language theory to describe languages recognized by finite automata. An NFA consists of a finite set of states, an input alphabet, transition rules, an initial state, and a set of final states.
Unlike deterministic finite automata (DFAs), NFAs allow for nondeterminism in the transition function, meaning that there can be zero, one, or several possible next states for a given input symbol and current state. The NFA accepts a string if at least one sequence of choices ends in a final state.
Languages recognized by NFAs are known as regular languages.
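The ε-NFA and NFA definitions above can be tried out with a small simulator. The following is a minimal sketch, not a reference implementation: the dictionary encoding of δ, the use of the empty string as the ε marker, and the names `eps_closure`, `accepts`, and `DELTA` are all assumptions made for illustration. It tracks the set of states the machine could be in, which is the same idea used when converting an ε-NFA to a DFA.

```python
from typing import Dict, Set, Tuple

EPS = ""  # assumed marker for an ε-transition (not part of the notes)

# δ encoded as: (state, symbol or EPS) -> set of next states
Delta = Dict[Tuple[str, str], Set[str]]


def eps_closure(states: Set[str], delta: Delta) -> Set[str]:
    """All states reachable from `states` using only ε-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def accepts(delta: Delta, start: str, finals: Set[str], word: str) -> bool:
    """Simulate the ε-NFA by tracking every state it could be in."""
    current = eps_closure({start}, delta)
    for symbol in word:
        moved: Set[str] = set()
        for q in current:
            moved |= delta.get((q, symbol), set())
        current = eps_closure(moved, delta)
    # Accept if at least one possible run ends in a final state.
    return bool(current & finals)


# Example ε-NFA for a*b*: q0 loops on 'a', jumps to q1 on ε, q1 loops on 'b'.
DELTA: Delta = {
    ("q0", "a"): {"q0"},
    ("q0", EPS): {"q1"},
    ("q1", "b"): {"q1"},
}

print(accepts(DELTA, "q0", {"q1"}, "aabb"))
print(accepts(DELTA, "q0", {"q1"}, "ba"))
```

An NFA without ε-transitions is the special case where no key uses `EPS`, so the same function covers both models.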

4-DFA
A Deterministic Finite Automaton (DFA) is a mathematical model used in computer science and formal language theory to describe languages recognized by finite automata. A DFA consists of a finite set of states, an input alphabet, transition rules, an initial state, and a set of final states. It is deterministic: for each state and input symbol there is exactly one next state.

5-REGULAR GRAMMAR
In Formal Languages and Automata Theory (FLAT), a regular language is a language that can be expressed by a regular expression or recognized by a finite automaton, such as a deterministic finite automaton (DFA) or a nondeterministic finite automaton (NFA).
Formally, a regular language L over an alphabet Σ is a subset of Σ*, the set of all possible strings over Σ, which satisfies one of the following equivalent conditions:
1. L can be generated by a regular grammar.
2. L can be recognized by a finite automaton (DFA, NFA, or ε-NFA).
3. L can be described by a regular expression.
Regular languages are fundamental in computer science, particularly in areas such as compiler construction, pattern matching, text processing, and formal language theory. They form the foundation of many important algorithms and data structures used in software development. Properties of regular languages, such as closure under various operations (union, concatenation, Kleene closure), are extensively studied in FLAT, contributing to the understanding of computational complexity and formal language theory.

5-CONTEXT FREE GRAMMAR
A context-free grammar (CFG) is a formal grammar used to generate strings in a context-free language. It consists of a set of production rules, each defining how symbols in the grammar can be replaced or expanded into other symbols. Context-free grammars are widely used in formal language theory.
Formally, a context-free grammar G is defined as a 4-tuple (V, Σ, R, S), where:
1. V is a finite set of non-terminal symbols (variables).
2. Σ is a finite set of terminal symbols (alphabet), disjoint from V.
3. R is a finite set of production rules, each rule of the form A → α, where A is a non-terminal symbol and α is a string of symbols from (V ∪ Σ)*, representing the possible replacements or expansions for A.
4. S is the start symbol, which is a designated non-terminal symbol from V that serves as the initial symbol for generating strings in the language.
A context-free grammar is called "context-free" because the production rules apply regardless of the context (surrounding symbols) of the non-terminal being replaced. This property makes context-free grammars simpler to analyze and work with compared to grammars with context-sensitive or unrestricted rules.

6-CONTEXT FREE LANGUAGE
A context-free language is a formal language that can be generated by a context-free grammar (CFG) or recognized by a non-deterministic pushdown automaton (PDA). These languages are a fundamental concept in formal language theory and computer science.
Formally, a language L is considered context-free if there exists a context-free grammar G such that L = L(G), where L(G) is the set of all strings generated by G. Equivalently, a language is context-free if there exists a non-deterministic pushdown automaton (PDA) that recognizes it.
Context-free languages are widely used in various areas of computer science, including compiler construction, natural language processing, parsing, and theoretical computer science. Many programming languages, such as C, Java, and Python, have syntaxes that can largely be described using context-free grammars.

7-PUSHDOWN AUTOMATA
A pushdown automaton (PDA) is a type of automaton used in computer science to recognize context-free languages. It extends the capabilities of finite automata by adding a stack, allowing it to recognize languages that cannot be described by finite automata alone.
Formally, a pushdown automaton is a 6-tuple (Q, Σ, Γ, δ, q0, F), where:
1. Q is a finite set of states.
2. Σ is a finite set of input symbols (the alphabet).
3. Γ is a finite set of stack symbols.
4. δ is the transition function, which maps Q × (Σ ∪ {ε}) × Γ to finite subsets of Q × Γ*, representing the possible transitions the machine can make based on the current state, input symbol (or ε for ε-transitions), and top symbol of the stack.
5. q0 is the initial state.
6. F is a set of final (or accepting) states.
A pushdown automaton operates similarly to a finite automaton, but it has access to an additional stack data structure. At each step, the PDA reads an input symbol (or makes an ε-move), consults its current state and the top symbol of the stack, and based on this information, it updates its state, possibly changes the stack contents, and moves to the next input symbol. The PDA accepts a string if it reaches an accepting state after processing the entire input string.
Pushdown automata are particularly useful for recognizing context-free languages, as they can handle nested structures and maintain unbounded amounts of memory (through the stack) during their computation. They are used in various areas of computer science, including compiler construction, parsing, and natural language processing. The formalism of pushdown automata provides a powerful tool for studying the properties and capabilities of context-free languages.

8-CHOMSKY HIERARCHY
The Chomsky hierarchy, proposed by linguist and cognitive scientist Noam Chomsky in the 1950s, is a classification of formal grammars and the languages they generate. It provides a way to categorize formal languages based on their generative power. The hierarchy consists of four levels, each representing a class of grammars and the corresponding languages they generate. These levels are:
1. Type 0: Unrestricted Grammars (Recursively Enumerable Languages)
o Unrestricted grammars have no restrictions on the production rules. They generate exactly the recursively enumerable languages, which are the languages recognized by Turing machines.
2. Type 1: Context-Sensitive Grammars
o Context-sensitive grammars have production rules of the form αAβ → αγβ, where A is a non-terminal symbol, α and β are strings of terminal and non-terminal symbols, and γ is a non-empty string. These grammars generate languages that are a proper subset of recursively enumerable languages. They are recognized by linear-bounded automata.
3. Type 2: Context-Free Grammars
o Context-free grammars have production rules of the form A → β, where A is a non-terminal symbol and β is a string of terminal and non-terminal symbols. These grammars generate context-free languages, which are recognized by pushdown automata. Context-free languages are widely used in computer science and linguistics.
4. Type 3: Regular Grammars (Regular Languages)
o Regular grammars have production rules of the form A → aB or A → a, where A and B are non-terminal symbols, and a is a terminal symbol. These grammars generate regular languages, which are recognized by finite automata. Regular languages are used in many practical applications, including lexical analysis, pattern matching, and string processing.

9-CONVERT CFG TO PDA
Conversion of CFG to PDA consists of five steps. The steps are as follows:
• Convert the CFG productions into GNF (Greibach Normal Form).
• There will only be one state, "q", on the PDA.
• The CFG's start symbol will also be the PDA's initial stack symbol.
• Include the following rule for non-terminal symbols:
o δ(q, ε, A) = (q, α), where the production rule is A → α
• Add the following rule for each terminal symbol:
o δ(q, a, a) = (q, ε) for every terminal symbol a
The resulting PDA accepts by empty stack: a string is accepted when the input is fully read and the stack is empty.
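The CFG-to-PDA steps listed above can be sketched in code. This is an illustrative sketch under several assumptions that are not in the notes: every grammar symbol is a single character, the grammar is already in GNF (so it has no ε-productions), acceptance is by empty stack, and the names `cfg_to_pda`, `accepts`, and the sample grammar `GNF` are hypothetical. Nondeterminism is handled by exploring every reachable configuration (input position, stack).

```python
from typing import Dict, List, Tuple

Grammar = Dict[str, List[str]]            # non-terminal -> right-hand sides
Delta = Dict[Tuple[str, str], List[str]]  # (input or "" for ε, stack top) -> pushes


def cfg_to_pda(grammar: Grammar, terminals: str) -> Delta:
    """Build the transitions of the single-state PDA described in the steps."""
    delta: Delta = {}
    for head, bodies in grammar.items():
        delta[("", head)] = list(bodies)   # δ(q, ε, A) = (q, α) for A → α
    for a in terminals:
        delta[(a, a)] = [""]               # δ(q, a, a) = (q, ε)
    return delta


def accepts(delta: Delta, start: str, word: str) -> bool:
    """Accept by empty stack, trying all nondeterministic choices."""
    todo = [(0, start)]                    # stack is a string, top at index 0
    seen = set()
    while todo:
        pos, stack = todo.pop()
        if (pos, stack) in seen:
            continue
        seen.add((pos, stack))
        if not stack:
            if pos == len(word):
                return True
            continue
        # Pruning (valid because a GNF grammar has no ε-productions):
        # each stack symbol must still produce at least one input symbol.
        if len(stack) > len(word) - pos:
            continue
        top, rest = stack[0], stack[1:]
        for body in delta.get(("", top), []):
            todo.append((pos, body + rest))        # expand a non-terminal
        if pos < len(word) and (word[pos], top) in delta:
            todo.append((pos + 1, rest))           # match a terminal
    return False


# Sample GNF grammar for { a^n b^n : n >= 1 }: S → aSB | aB, B → b
GNF: Grammar = {"S": ["aSB", "aB"], "B": ["b"]}
PDA = cfg_to_pda(GNF, "ab")

print(accepts(PDA, "S", "aabb"))
print(accepts(PDA, "S", "aab"))
```

The pruning step is what guarantees termination here; for a grammar that is not in GNF (for instance one with left recursion) this search could run forever, which is one practical reason for the GNF step.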

10-INSTANTANEOUS DESCRIPTION IN PDA
In a Pushdown Automaton (PDA), an instantaneous description (ID) represents the current configuration of the automaton at a specific point during its computation. It provides a snapshot of the state of the PDA, including the contents of its stack and its current state.
Formally, an instantaneous description for a PDA typically includes the following components, usually written as a triple (q, w, γ):
1. Current State: The state q that the PDA is currently in.
2. Input: The remaining input string w that the PDA has yet to process.
3. Stack Contents: The contents γ of the stack, which may include symbols pushed onto the stack during the computation.
The combination of these components fully describes the state of the PDA at a particular step in its operation. The instantaneous description allows for tracking and understanding how the PDA processes input strings and manipulates its stack during computation.
During the execution of a PDA, the instantaneous description changes as the automaton reads input symbols, transitions between states, and manipulates the stack according to its transition rules. Analyzing the sequence of instantaneous descriptions provides insight into the behavior and operation of the PDA on different input strings.

11-MULTITAPE TM
A multitape Turing machine (MTM) is a variation of the classical Turing machine model that features multiple tapes, each equipped with its own independent read/write head. This model does not recognize more languages than a single-tape Turing machine, but it lets the machine simultaneously access and manipulate multiple tapes during computation.
Formally, a multitape Turing machine is defined by a 7-tuple (Q, Σ, Γ, δ, q0, qaccept, qreject), where:
• Q is a finite set of states.
• Σ is the input alphabet, which does not include the special blank symbol.
• Γ is the tape alphabet, which includes the input alphabet Σ as well as a special blank symbol.
• δ is the transition function, which maps Q × Γ^k to Q × Γ^k × {L, R}^k, where k is the number of tapes. This function specifies the machine's behavior when reading the current symbol(s) on each tape and being in a certain state.
• q0 is the initial state.
• qaccept is the accepting state.
• qreject is the rejecting state.
In a multitape Turing machine, each tape operates independently, allowing the machine to read from or write to multiple positions on multiple tapes simultaneously. The transition function dictates how the machine updates its internal state and the contents of each tape based on the current configuration.
MTMs offer advantages over single-tape Turing machines in terms of efficiency and convenience for certain tasks, such as simulating parallel algorithms, handling multiple inputs simultaneously, and simplifying certain algorithms by utilizing multiple tapes for intermediate storage.

11-MULTITAPE TM BY SINGLE TAPE TM
• Theorem 3.13: Every multitape Turing machine has an equivalent single-tape Turing machine.
• Simulate an n-tape machine using a one-tape machine:
o Store information on all tapes on the single tape.
o Separate information from different tapes using a new symbol (e.g. #).
o Use a dotted symbol to represent the location of each tape head.
§ When the tape head moves to a location, replace the symbol there (e.g. symbol x) with the equivalent dotted symbol (i.e. with dotted x).
§ When the tape head moves from a location, write a regular (i.e. non-dotted) symbol at that location.
§ Requires adding dotted symbols to the tape alphabet.
o Develop a new transition function:
§ A step of the n-tape machine will represent steps on all n of the parts of the single tape.
§ A right movement onto the # results in shifting to the right one symbol all of the symbols that are to the right of the #.

12-L1 ∪ L2 AND L1 ∩ L2 ARE RECURSIVELY ENUMERABLE
To prove that the union and intersection of recursively enumerable languages are also recursively enumerable, we can use the definition of a recursively enumerable language.

Recall that a language is recursively enumerable (RE) if there exists a Turing machine that halts and accepts any string in the language, and either halts and rejects or loops indefinitely for strings not in the language.

Let's prove each part separately:

1. **Union of RE Languages (L1 ∪ L2)**:
- Let's denote the Turing machines that recognize L1 and L2 as M1 and M2, respectively.
- To recognize strings in the union of L1 and L2, we can construct a new Turing machine M that simulates both M1 and M2:
- On input w, M runs M1 and M2 on w in parallel, alternating one step of M1 with one step of M2. If either machine accepts w, M accepts.
- M cannot simply run M1 to completion first, because M1 may loop indefinitely on w and M has no way to detect that; M2 would then never get a chance to accept.
- If w is in L1 ∪ L2, at least one of M1 and M2 accepts after finitely many steps, so M halts and accepts exactly the strings in L1 ∪ L2, and thus L1 ∪ L2 is recursively enumerable.

2. **Intersection of RE Languages (L1 ∩ L2)**:
- Similarly, let's denote the Turing machines that recognize L1 and L2 as M1 and M2, respectively.
- To recognize strings in the intersection of L1 and L2, we can construct a new Turing machine M that simulates both M1 and M2:
- On input w, M simulates M1 on w. If M1 rejects, M rejects; if M1 loops indefinitely, M loops as well, which is allowed because w is then not in L1 ∩ L2.
- If M1 accepts w, M simulates M2 on w. If M2 accepts, M accepts; if M2 rejects or loops, M does not accept.
- If w is in L1 ∩ L2, both simulations halt and accept, so M halts and accepts exactly the strings in L1 ∩ L2, and thus L1 ∩ L2 is recursively enumerable.

Therefore, we have proved that the union and intersection of recursively enumerable languages are also recursively enumerable.
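The union construction for recursively enumerable languages runs the two recognizers side by side, one step at a time. The sketch below imitates that with Python generators. It is only an analogy: each "machine" is assumed to be a generator function that yields once per step and returns True (accept) or False (reject), or never returns (loops). The names `union_recognizer`, `starts_with_a_or_loops`, and `even_length`, and the `max_rounds` budget, are hypothetical aids for the demonstration and not part of the proof.

```python
from typing import Callable, Generator, Optional

# A "machine": a generator function that yields once per simulated step
# and returns True/False when it halts (it may also never halt).
Machine = Callable[[str], Generator[None, None, bool]]


def union_recognizer(m1: Machine, m2: Machine):
    """Recognizer for L(m1) ∪ L(m2) that alternates steps of m1 and m2."""

    def run(w: str, max_rounds: Optional[int] = None) -> Optional[bool]:
        runs = [m1(w), m2(w)]
        rounds = 0
        # max_rounds exists only so the demo can stop; a real recognizer
        # is allowed to run forever on strings outside the language.
        while runs and (max_rounds is None or rounds < max_rounds):
            rounds += 1
            for r in list(runs):
                try:
                    next(r)                 # advance this machine one step
                except StopIteration as stop:
                    if stop.value:
                        return True         # one machine accepted
                    runs.remove(r)          # this machine halted and rejected
        return False if not runs else None  # None: still running, unknown

    return run


def starts_with_a_or_loops(w: str) -> Generator[None, None, bool]:
    """Accepts strings starting with 'a'; loops forever on anything else."""
    if w.startswith("a"):
        yield
        return True
    while True:
        yield


def even_length(w: str) -> Generator[None, None, bool]:
    """Halts on every input; accepts strings of even length."""
    for _ in w:
        yield
    return len(w) % 2 == 0


M = union_recognizer(starts_with_a_or_loops, even_length)
print(M("bb"))                 # first machine loops, second one still accepts
print(M("b", max_rounds=50))   # not in the union: no answer within the budget
```

On input "bb" a machine that ran the first recognizer to completion before starting the second would never answer, which is why the alternating order of simulation matters.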
13-LINEAR BOUNDED AUTOMATA
Linear Bounded Automata (LBA) are a class of computational devices that serve as a restricted form of Turing machines. LBAs operate on finite input strings while restricting their tape usage to a space proportional to the length of the input string. This constraint ensures that LBAs cannot move beyond a bounded region of the tape, making them less powerful than Turing machines, which have unbounded tape length.

Formally, a Linear Bounded Automaton is defined by a 6-tuple (Q, Σ, Γ, δ, q0, F), where:

- Q is a finite set of states.
- Σ is the input alphabet.
- Γ is the tape alphabet, which includes Σ as well as a special blank symbol.
- δ is the transition function, which maps Q × Γ to Q × Γ × {L, R}, specifying the machine's behavior when reading the current symbol and being in a certain state.
- q0 is the initial state.
- F is a set of final states.

The key restriction of LBAs is that their tape usage is bounded by a linear function of the input size. More precisely, the tape head of an LBA can only move within a constant factor of the length of the input string. This limitation ensures that LBAs cannot use an arbitrarily large amount of tape, as Turing machines can.

Linear Bounded Automata are particularly useful for studying problems that require bounded space, such as parsing context-sensitive languages or performing certain types of string manipulation. They also serve as an important theoretical tool for understanding the limitations of computational devices with restricted resources, contributing to the study of complexity theory and formal language theory.

12-HALTING PROBLEM
The Halting Problem is determining whether a computer program will eventually stop or run forever. Creating a general algorithm that can accurately predict this for all programs is impossible. Alan Turing's proof showed that there is no way to solve the Halting Problem for all cases.
Before moving on to the proof, let's first understand some terms.
Undecidable Problems
An undecidable problem is a sort of computational problem requiring a yes/no answer but where no computer program can give the proper answer all of the time; that is, any possible algorithm or program would sometimes give the wrong answer or run forever without providing any answer.
Step 1: Assume we can create a machine called HM(P, I), where HM is the Halting machine, P is the program, and I is the input. After receiving both inputs, the machine HM will output whether or not the program P terminates on input I.
Step 2: Now, create an inverted halting machine IM that takes a program P as input, runs HM on P with P itself as the input, and,
• Loops forever if HM returns YES.
• Halts if HM returns NO.
Step 3: Now, take a situation where the program IM is passed to the IM function as an input. Here, we get a contradiction. If IM halts on input IM, then HM must have returned NO, which says that IM does not halt on IM. If IM loops forever on input IM, then HM must have returned YES, which says that IM halts on IM. Either way HM gave a wrong answer, so the machine HM cannot exist.

13-POST CORRESPONDENCE PROBLEM
The Post Correspondence Problem (PCP), introduced by Emil Post in 1946, is an undecidable decision problem. The PCP problem over an alphabet Σ is stated as follows:

Given the following two lists, M and N, of non-empty strings over Σ:

M = (x_1, x_2, x_3, …, x_n)
N = (y_1, y_2, y_3, …, y_n)

We can say that there is a Post Correspondence Solution if, for some i_1, i_2, …, i_k with k ≥ 1 and 1 ≤ i_j ≤ n, the condition x_i1 … x_ik = y_i1 … y_ik holds.
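To make the Post Correspondence Problem definition concrete, here is a small brute-force search. Because PCP is undecidable, no program can decide every instance; this sketch only tries index sequences up to an assumed length bound `max_k`, so a result of `None` means "nothing found within the bound", not "no solution exists". The function name `pcp_bounded` and the sample lists are illustrative choices, not taken from the notes.

```python
from itertools import product
from typing import Optional, Sequence, Tuple


def pcp_bounded(xs: Sequence[str], ys: Sequence[str],
                max_k: int) -> Optional[Tuple[int, ...]]:
    """Search for indices i_1..i_k (1-based, k <= max_k) such that
    x_i1 ... x_ik == y_i1 ... y_ik. Returns the first sequence found."""
    assert len(xs) == len(ys), "M and N must have the same length n"
    for k in range(1, max_k + 1):
        for idx in product(range(len(xs)), repeat=k):
            top = "".join(xs[i] for i in idx)
            bottom = "".join(ys[i] for i in idx)
            if top == bottom:
                return tuple(i + 1 for i in idx)   # report 1-based indices
    return None


# Sample instance: M = (a, ab, bba), N = (baa, aa, bb)
M_LIST = ("a", "ab", "bba")
N_LIST = ("baa", "aa", "bb")

solution = pcp_bounded(M_LIST, N_LIST, max_k=4)
print(solution)  # (3, 2, 3, 1): bba·ab·bba·a == bb·aa·bb·baa == "bbaabbbaa"
```

The search space grows as n^k, so this is only practical for tiny instances; it illustrates the definition rather than offering a way around undecidability.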
