CSE437 Assignment 8

Assignment

2130734 - Ikram Hossain Akif

Whenever we define a context-free language, constructing its grammar is critical. Each string
belonging to the language can be derived using the grammar, and produces a unique parse tree;
or does it?

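In some cases, the grammar may be ambiguous, meaning that one string may be parsed in multiple
ways. For example, the grammar S → SS | a generates the string aaa, but this string has two parse
trees, one grouping it as (aa)a and one as a(aa). Written as leftmost derivations:

(aa)a: S ⇒ SS ⇒ SSS ⇒ aSS ⇒ aaS ⇒ aaa
a(aa): S ⇒ SS ⇒ aS ⇒ aSS ⇒ aaS ⇒ aaa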
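
The number of parse trees can also be counted mechanically. The following is a minimal sketch, not part of the original assignment, and the function name count_parse_trees is hypothetical. It assumes the grammar S → SS | a, where a string of n a's is produced either by S → a (when n = 1) or by S → SS, with the left S deriving the first k symbols and the right S deriving the rest.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parse_trees(n: int) -> int:
    """Count the parse trees of the string a^n under S -> SS | a."""
    if n < 1:
        return 0  # this grammar cannot derive the empty string
    if n == 1:
        return 1  # only S -> a applies
    # S -> SS: try every split point between the two halves.
    return sum(count_parse_trees(k) * count_parse_trees(n - k) for k in range(1, n))


print(count_parse_trees(3))  # 2, matching the groupings (aa)a and a(aa)
```

Any result greater than 1 means the string is parsed ambiguously, and the count grows quickly with the length of the string.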

To remove ambiguity, we can restructure the grammar to enforce precedence and ensure only one
derivation is available for each string. For the above example, this could be S → aS | a, which
removes the alternatives and leaves a single parse tree for the string aaa:

S ⇒ aS ⇒ aaS ⇒ aaa

Once ambiguity is removed and every string in the language has a single, unique parse tree,
we can turn to the converse problem: how do we check whether a given string is accepted?
With regular languages, this was simpler; we would run the string through the automaton and
see whether we end up in a final state. But with context-free grammars, we only have the start
variable and the rules to begin with. Any combination of rules could lead to the string, and we
have no choice but to exhaust them all.

This is computationally expensive and tedious. It would benefit us greatly to have a way to
eliminate alternative paths early, without expanding intermediate strings all the way only to
find that they do not match the string needed. The simplest idea would be to discard a path
whenever its intermediate string is longer than the required string. But this poses problems
in two cases: ε-productions and unit productions.

Whenever a production includes ε, it contradicts a critical assumption of our initial idea: that
once the intermediate string grows longer than required, it cannot shrink again. But ε is the
empty string and contributes nothing to the final string. So when a variable may be replaced
by ε, the length of the intermediate string may decrease. For example, in the grammar
S → 0S1 | ε, deriving the string 0011, which is accepted, takes these steps:

S ⇒ 0S1
⇒ 00S11
⇒ 0011

Here we see that to derive the required string, it is necessary to exceed the length of the
string being checked: the intermediate string 00S11 has five symbols, while 0011 has only four.
So our idea will fail if there are any empty productions, or ε-productions, present. This means
that we must devise a way to remove ε-productions from the grammar but have it still represent
the same language.
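
The failure can be reproduced with a small search. The sketch below is an illustration under assumptions, not the assignment's method: the grammar is encoded as a hypothetical dictionary in which uppercase letters are variables and the empty string "" stands for ε, and the function derives with its prune flag and max_steps cap is invented for this example. It expands the leftmost variable breadth-first and, when prune is set, discards any intermediate string longer than the target.

```python
from collections import deque

# Hypothetical encoding of S -> 0S1 | ε; "" stands for ε.
GRAMMAR = {"S": ["0S1", ""]}


def derives(grammar, start, target, prune, max_steps=10_000):
    """Breadth-first search over intermediate strings (leftmost expansion)."""
    queue = deque([start])
    seen = {start}
    steps = 0
    while queue and steps < max_steps:
        form = queue.popleft()
        steps += 1
        if form == target:
            return True
        # Locate the leftmost variable; a form without one is a finished string.
        pos = next((i for i, ch in enumerate(form) if ch in grammar), None)
        if pos is None:
            continue
        for body in grammar[form[pos]]:
            new_form = form[:pos] + body + form[pos + 1:]
            if prune and len(new_form) > len(target):
                continue  # assumes the length never shrinks, which ε breaks
            if new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return False


print(derives(GRAMMAR, "S", "0011", prune=True))   # False: 00S11 was discarded
print(derives(GRAMMAR, "S", "0011", prune=False))  # True
```

The max_steps cap is there only because the unpruned search never terminates on its own for strings outside the language, which is exactly the cost the pruning idea was meant to avoid.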

To do this we employ these steps: if the start variable can be replaced by ε, we keep that
rule, since ε itself belongs to the language, and use another variable to represent the rest
of the derivation; if a variable produces ε and nothing more, we delete it from every rule in
which it appears; if a variable has other productions besides ε, we keep the original rules
containing that variable but also add a copy of each in which that variable has been omitted.
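
These steps can be sketched in code. The version below is an assumption-laden illustration rather than the assignment's own procedure: it uses the standard nullable-variable construction, which covers the same cases, and the names remove_epsilon, nullable and the dictionary encoding (single uppercase letters as variables, "" for ε) are all hypothetical.

```python
from itertools import product
from string import ascii_uppercase


def remove_epsilon(grammar, start):
    """Return (new_grammar, new_start) with ε allowed only on a fresh start variable."""
    # 1. Find the nullable variables, i.e. those that can derive ε.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var in nullable:
                continue
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(var)
                changed = True

    # 2. For every rule, add each variant with nullable variables kept or omitted.
    result = {}
    for var, bodies in grammar.items():
        variants = set()
        for body in bodies:
            options = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for combo in product(*options):
                variant = "".join(combo)
                if variant and variant != var:  # drop ε and the trivial X -> X
                    variants.add(variant)
        result[var] = variants

    # 3. If ε was in the language, keep it on a new start variable only.
    if start in nullable:
        new_start = next(c for c in reversed(ascii_uppercase) if c not in grammar)
        result[new_start] = {start, ""}
        start = new_start
    return result, start


new_grammar, new_start = remove_epsilon({"S": {"0S1", ""}}, "S")
print(new_start, new_grammar)
```

For S → 0S1 | ε this yields S → 0S1 | 01 together with a new start rule Z → S | ε. A variable that produced only ε ends up with no rules at all, so any rule that still mentions it can never be completed and may be discarded.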

For example, the grammar S → AS | SB | ε, A → ε, B → 0B | ε can be transformed to remove
ε-productions:

Now that the length can never decrease after increasing, parsing strings looks a lot more feasible,
and our idea of comparing lengths may prove worthwhile. But there is another issue: unit
productions. If there are rules that form a chain such as S → A and A → S, a derivation may loop
forever without the string growing, so such rules must be eliminated. Unit productions appear in
two scenarios, and each has its own fix: if two variables form a cycle in which each can be replaced
by the other (A → S, S → A), they generate the same strings, so we replace one of them with the
other throughout the grammar; if the rule goes only one way, we skip the unit production and use
its results directly (A → S, S → a gives A → a).
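
Both scenarios can be handled by one procedure. The following is a minimal sketch under assumptions, not the assignment's own method: the function name remove_unit_productions and the dictionary encoding (single uppercase letters as variables, each mapped to a set of rule bodies) are hypothetical. For each variable it collects every variable reachable through unit rules alone, then copies over their non-unit rules; tracking the variables already visited is what keeps a cycle from looping forever.

```python
def remove_unit_productions(grammar):
    """Replace every unit rule X -> Y with the non-unit rules reachable from Y."""

    def is_unit(body):
        # A unit rule has a body consisting of exactly one variable.
        return len(body) == 1 and body in grammar

    result = {}
    for var in grammar:
        # Collect the variables reachable from var through unit rules only.
        reachable = {var}
        stack = [var]
        while stack:
            current = stack.pop()
            for body in grammar[current]:
                if is_unit(body) and body not in reachable:
                    reachable.add(body)
                    stack.append(body)
        # Keep only the non-unit bodies of everything reachable.
        result[var] = {b for r in reachable for b in grammar[r] if not is_unit(b)}
    return result


one_way = remove_unit_productions({"A": {"S"}, "S": {"a"}})
print(one_way["A"])  # {'a'}
```

In the cyclic case, for example A → S | 0 and S → A | 1, both variables end up with the same rules, which reflects that they were interchangeable to begin with.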
