CSE437 Assignment 8
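A grammar may be ambiguous, meaning that a single string can be parsed in more than one way.
For example, the grammar S → SS | a accepts the string aaa, but after the first step S ⇒ SS the string
can be split either as a followed by aa or as aa followed by a, which gives two different parse trees.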
To remove ambiguity, we can restructure the grammar to enforce precedence and ensure that only one
derivation path is available. For the above example, this could be S → aS | a, which removes the
choice and leaves a single parse tree for the string aaa, built by S ⇒ aS ⇒ aaS ⇒ aaa.
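The difference between the two grammars can be checked by brute force. The following is a minimal sketch, not part of the assignment itself: the helper names `splits` and `parse_trees` and the dictionary encoding of a grammar are assumptions made for illustration, and the code assumes the grammar has no ε-productions and no unit rules, so that the recursion terminates.

```python
from itertools import product

def splits(s, k):
    """Yield every way to cut s into k non-empty consecutive pieces."""
    if k == 1:
        if s:
            yield (s,)
        return
    for i in range(1, len(s) - k + 2):
        for rest in splits(s[i:], k - 1):
            yield (s[:i],) + rest

def parse_trees(grammar, symbol, s):
    """Return all parse trees of s rooted at symbol, as nested tuples.

    grammar maps each variable to a list of right-hand sides (tuples of
    symbols); any symbol that is not a key is treated as a terminal.
    Assumes no ε-productions and no unit rules (illustrative restriction).
    """
    if symbol not in grammar:                      # terminal symbol
        return [symbol] if s == symbol else []
    trees = []
    for rhs in grammar[symbol]:
        for pieces in splits(s, len(rhs)):
            # Parse each piece with the matching symbol, then combine.
            options = [parse_trees(grammar, sym, piece)
                       for sym, piece in zip(rhs, pieces)]
            for children in product(*options):
                trees.append((symbol,) + children)
    return trees

ambiguous = {"S": [("S", "S"), ("a",)]}    # S → SS | a
unambiguous = {"S": [("a", "S"), ("a",)]}  # S → aS | a

print(len(parse_trees(ambiguous, "S", "aaa")))    # 2
print(len(parse_trees(unambiguous, "S", "aaa")))  # 1
```

The first grammar yields two trees for aaa, while the restructured grammar yields exactly one.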
Once ambiguity is removed and every accepted string has a single, unique parse tree, we can
turn to the converse question: how do we check whether a string is accepted?
With regular languages this was simpler: we would run the string through the automaton and see
if we end up in a final state. But with context-free grammars, we only have the start variable and
the rules to begin with. Any sequence of rule applications could lead to the string, so we have
no choice but to exhaust them all.
Whenever a production includes ε, it contradicts a critical assumption of our initial idea: that
once the intermediate string grows longer than required, its length cannot decrease. But ε
is the empty string and contributes nothing to the final string. So when a variable may be
replaced by ε, the length of the intermediate string may decrease. For example, in the grammar
S → 0S1 | ε, if we want to derive the string 0011, which is accepted, we would follow these steps:
S ⇒ 0S1
⇒ 00S11
⇒ 0011
Here we see that to derive the required string of length 4, the derivation must pass through
00S11, which has length 5, so it is necessary to exceed the length of the string being checked.
So our idea will fail if any ε-productions (empty productions) are present. This means that we
must devise a way to remove ε-productions from the grammar while having it still represent the
same language.
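The failure can be reproduced with a small experiment. The sketch below is one possible reading of the length-bounded exhaustive search described above, not a definitive implementation: the function name `derives`, the tuple encoding of sentential forms, and the ε-free comparison grammar S → 0S1 | 01 are assumptions introduced for illustration.

```python
from collections import deque

def derives(grammar, start, target):
    """Exhaustive leftmost derivation search with length pruning.

    Sentential forms are tuples of symbols; ε is the empty tuple ().
    Any form longer than the target is discarded, which is only sound
    when the grammar has no ε-productions.
    """
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Locate the leftmost variable, if any.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            if "".join(form) == target:
                return True
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            if len(new_form) <= len(target) and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return False

with_epsilon = {"S": [("0", "S", "1"), ()]}          # S → 0S1 | ε
epsilon_free = {"S": [("0", "S", "1"), ("0", "1")]}  # S → 0S1 | 01

print(derives(with_epsilon, "S", "0011"))  # False: 00S11 is pruned
print(derives(epsilon_free, "S", "0011"))  # True
```

With the ε-production present, the search discards 00S11 because its length is 5, so 0011 is wrongly rejected. With the ε-free grammar, no intermediate form ever needs to be longer than the target, and the same search accepts the string.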
To do this we employ these steps: if the start variable can be replaced by ε, we leave that
production in place and introduce another variable to represent the rest of the derivation; if a
variable produces ε and nothing more, we delete it from every rule in which it appears; if a
variable has other productions besides ε, we keep the original rules containing that variable
and also add a copy of each such rule in which that variable has been replaced by ε.
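These steps can be sketched in code using the usual notion of a nullable variable (one that can derive ε). This is a hedged illustration rather than the assignment's own procedure: the function name `remove_epsilon`, the grammar encoding, and the choice to name the new start variable by appending "0" (assumed not to clash with an existing variable) are all assumptions.

```python
from itertools import product

def remove_epsilon(grammar, start):
    """Return (new_grammar, new_start) without ε-productions, except
    possibly on a fresh start variable. ε is the empty tuple ()."""
    # Find every variable that can derive ε.
    nullable, changed = set(), True
    while changed:
        changed = False
        for var, rules in grammar.items():
            if var not in nullable and any(
                    all(sym in nullable for sym in rhs) for rhs in rules):
                nullable.add(var)
                changed = True

    # Keep each rule and add copies with nullable variables deleted.
    new_grammar = {}
    for var, rules in grammar.items():
        out = set()
        for rhs in rules:
            choices = [((sym,), ()) if sym in nullable else ((sym,),)
                       for sym in rhs]
            for combo in product(*choices):
                new_rhs = tuple(sym for part in combo for sym in part)
                if new_rhs:                  # drop the ε-production itself
                    out.add(new_rhs)
        new_grammar[var] = sorted(out)

    # Variables left with no rules produced only ε: remove them everywhere.
    dead = {v for v, rules in new_grammar.items() if not rules}
    while dead:
        new_grammar = {v: [r for r in rules if not dead & set(r)]
                       for v, rules in new_grammar.items() if v not in dead}
        dead = {v for v, rules in new_grammar.items() if not rules}

    # If the start variable was nullable, a fresh start variable keeps ε.
    if start in nullable:
        new_start = start + "0"  # assumed to be an unused name
        new_grammar[new_start] = [()] + ([(start,)] if start in new_grammar else [])
        return new_grammar, new_start
    return new_grammar, start

print(remove_epsilon({"S": [("0", "S", "1"), ()]}, "S"))
```

For S → 0S1 | ε this yields S → 01 | 0S1 together with a new start variable S0 → ε | S, so the only remaining ε-production sits on the start variable and no longer appears inside longer derivations.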