Monthly Archives: April 2021

Context-Free Programmed Language Membership

Here’s another kind of grammar I hadn’t heard of until doing these problems.

The problem: Context-Free Programmed Language Membership.  This is problem AL17 in the appendix.

The description: Given an ε-free Context-Free programmed grammar (explained below) G, and a string x, is x in the language generated by G?

Example: A programmed grammar gives a label to each production, and then, after each production, two sets of labels for the “next” production: one set to use if we applied the production, and one to use if we didn’t.

So we might write a grammar like:

1: S->aSa {2} {3}

This means that if we apply the production S->aSa, we next go to production 2. If we don’t, we next go to production 3.

2: S->bSb {1,4}{3}

Now, if we apply this production, we have a choice of 2 production rules to follow: we can go to 1 again, or we can go to 4.

3: S->a {}{}

Since this production’s right-hand side has no non-terminals, both label sets are empty: there is no next production.

4: S->b {} {}

So, this grammar could produce the string ababa by the productions:

1: S=>aSa.  Next, we go to 2.

2: aSa=>abSba.  Next, we’ll go to 1 (we could have also gone to 4)

1: abSba=>abSba (we decide not to do the production, and so go to 3)

3: abSba=>ababa.  (we stop because we have no non-terminals left)
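To make the control flow concrete, here’s a small Python sketch of a programmed-grammar derivation. The representation (a label mapped to a production plus its success and failure label sets) is my own, not notation from the appendix:

```python
# Each rule: label -> (lhs, rhs, success labels, failure labels).
rules = {
    1: ("S", "aSa", {2}, {3}),
    2: ("S", "bSb", {1, 4}, {3}),
    3: ("S", "a", set(), set()),
    4: ("S", "b", set(), set()),
}

def apply_rule(string, label):
    """Apply the labeled production to the leftmost occurrence of its
    left-hand side; return (new string, set of allowed next labels)."""
    lhs, rhs, success, failure = rules[label]
    i = string.find(lhs)
    if i == -1:                     # production not applicable: failure field
        return string, failure
    return string[:i] + rhs + string[i + len(lhs):], success

# Replay the derivation from the post (rule 3 is reached through
# rule 1's failure field when we decline to apply rule 1 again):
s = "S"
s, _ = apply_rule(s, 1)   # S     => aSa
s, _ = apply_rule(s, 2)   # aSa   => abSba
s, _ = apply_rule(s, 3)   # abSba => ababa
print(s)                  # ababa
```

Following the program, rule 2’s success field {1,4} lets us return to rule 1, and rule 1’s failure field sends us on to rule 3, which is how the derivation reaches ababa.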

Reduction: The paper by Shamir and Beeri has the reduction, but I have a hard time following their notation, so I will try to rephrase it.  The reduction is from 3-CNF-Satisfiability.

What we’ll do is start with a CNF formula.  We’ll build a CFG that, for each clause (l1, l2, l3), generates all combinations of T1, T2, T3, F1, F2, and F3 with at least one T.  This represents the ways we can make a clause true.  The T’s and F’s will be non-terminals.
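As a quick sanity check on the count, here’s a sketch (my own enumeration, not the paper’s construction) of the combinations generated for one clause: every assignment of T or F to the three positions with at least one T, which is 7 of the 8 possibilities:

```python
from itertools import product

# All T/F patterns of length 3 that contain at least one T,
# i.e., all ways the three literal positions can make a clause true.
combos = [c for c in product("TF", repeat=3) if "T" in c]
print(len(combos))    # 7
print(combos[0])      # ('T', 'T', 'T')
```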

Now we will try to replace the T’s and F’s with the actual literals from the formula with the following rules:

1:  T1-> q1  {2}{3}

2: T1->q1 {2} {4}

3: T1->~q1 {3}{5}

4: F1->~q1 {4}{6}

5: F1->q1 {5}{6}

Rule 6 starts the process again on the next variable.  The idea is that rule 1 either replaces a T1 with q1 (setting variable q1 to true, and sending us to rules 2 and 4, which replace all of the remaining T1’s with q1 and all of the F1’s with ~q1), or doesn’t (which sends us to rule 3, where we replace all of the T1’s with ~q1, meaning we set q1 to false, and then to rule 5 to replace all of the F1’s with q1).
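Here’s a sketch of the net effect of those chains for one variable, using Python string replacement in place of the grammar machinery (the function and the example strings are my own illustration):

```python
# Committing to "q1 = true" at rule 1 runs the chain through rules 1-2
# (every T1 -> q1) and then rule 4 (every F1 -> ~q1); committing to
# false runs rules 3 (every T1 -> ~q1) and 5 (every F1 -> q1).
def assign(string, value):
    if value:   # rules 1, 2, then 4
        return string.replace("T1", "q1").replace("F1", "~q1")
    else:       # rules 3, then 5
        return string.replace("T1", "~q1").replace("F1", "q1")

print(assign("T1 F1 T1", True))    # q1 ~q1 q1
print(assign("T1 F1 T1", False))   # ~q1 q1 ~q1
```

Either way, every occurrence of the variable gets a consistent replacement, which is exactly what the program’s forced chains enforce.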

Our question then becomes: can this grammar generate the original input string of the formula?

Difficulty: 7.  I had an alternate formulation where I replaced each literal with T or F and tried to form a chain of all T’s.  I couldn’t think of a way to let the grammar enforce “at least one T in a clause” and still keep it context-free (a production like TTT->”yes” is not context-free because there is more than one symbol on the left side).  I think that’s why this solution goes the other direction: the final string has the literals, whereas in mine the literals in the formula were non-terminals and I wanted my terminal string to be “Yes”.  But the idea of using the program to “choose” whether a variable is set to true or false, so that you’re then forced to use the substitutions given in the program to keep all of the other replacements of that variable consistent, isn’t that hard to get.


ETOL Grammar Non-Emptiness

I had never heard of this type of grammar before.  Let’s see if I can explain it.

The problem: ETOL (or “ET0L”, with a digit 0 instead of the letter ‘O’) Grammar Non-Emptiness.  This is problem AL16 in the appendix.

Description: An ET0L grammar is a grammar consisting of:

  • A set of non-terminal symbols N
  • A terminal alphabet Σ
  • A start symbol S from N  (The paper that has the reduction allows a start string containing multiple symbols, but we can get that by having a start production S-> the start string)
  • Productions in this grammar are defined by a set P of tables.  Entries in the table map a symbol in N or Σ to a string in (N∪Σ)*

While multiple tables are allowed, our reduction will have just one table in P.

Given a grammar G, is the language it generates non-empty?

Example:  So, to get where we’re going in the reduction, it’s helpful to note that the main difference between this kind of grammar (especially with just 1 table) and a regular Context-Free grammar is that terminals are allowed to be on the left hand side of a production.  But we don’t have to use that property.
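Here’s a minimal sketch of a one-table derivation step, showing the other key difference from a CFG: every symbol in the string, terminals included, is rewritten in parallel on each step. The dictionary representation (where symbols with no table entry map to themselves) is my assumption, not the paper’s definition:

```python
# One parallel derivation step of a one-table ET0L system: every symbol
# is rewritten at once; symbols without a table entry stay unchanged.
def step(string, table):
    return "".join(table.get(sym, sym) for sym in string)

table = {"S": "aSb", "a": "aa"}    # note: terminal 'a' has a rule too
s = "S"
s = step(s, table)    # S   -> aSb
s = step(s, table)    # aSb -> aa + aSb + b = aaaSbb
print(s)              # aaaSbb
```

With this table the terminal a doubles on every step, something a context-free derivation can never do to a terminal.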

A grammar is empty if it never produces a string made up entirely of terminals.  So, for example, a grammar whose only production is S->aS is empty, because every string we can derive still contains the non-terminal S.

Reduction: The paper is by Jones and Skyum, and the reduction is from Regular Expression Non-Universality on one-character alphabets.  This is Lemma 30 from that paper.  We’re given a regular expression R that came from the 3SAT reduction.  Recall that the expression we built looked like this:

\bigcup\limits_{k=1}^{n} \bigcap\limits_{j=2}^{p_{k}-1} [0^j \cdotp (0^{p_k})^*]

What we care about for our grammar is that there are n copies of an expression of the form 0^{p_i}(0^{q_i})^* (where i goes from 1 to n).

In the grammar we build, we have:

  • Symbols of the form Z_i^j, where i goes from 1 to n, and in each Z_i, the j runs from 0 to p_i + q_i -1
  • Z_1^{p_1} through Z_n^{p_n} are the non-terminals, everything else is a terminal.
  • Our start string is Z_1^0 Z_2^0...Z_n^0
  • For every j except p_i + q_i - 1 (the last one) we have the production Z_i^j \rightarrow Z_i^{j+1}
  • For the last j, we have the production Z_i^{p_i+q_i-1}\rightarrow Z_i^{p_i}

The paper asserts without really proving it that the language associated with this grammar is non-empty if and only if the expression we started with is not 0*.  I think this happens because we get 0* in the expression if one of the q_i values is non-zero.  If that happens, then the Z_i^{p_i+q_i-1}\rightarrow Z_i^{p_i} production isn’t a self-loop, and we can reach one of the terminal values.
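Here’s a sketch of the index cycle for a single chain, which is the self-loop observation above (the function and its parameters are my own illustration):

```python
# Indices visited by one chain Z_i^0 -> Z_i^1 -> ...: after reaching the
# top index p_i + q_i - 1, the wrap production sends it back to p_i.
def visited(p_i, q_i, steps):
    j, seen = 0, []
    for _ in range(steps):
        seen.append(j)
        j = j + 1 if j < p_i + q_i - 1 else p_i   # wrap production
    return seen

print(visited(2, 3, 8))   # [0, 1, 2, 3, 4, 2, 3, 4]
print(visited(2, 1, 6))   # [0, 1, 2, 2, 2, 2]
```

When q_i = 1 the wrap production rewrites Z_i^{p_i} to itself, so that position is stuck on the non-terminal forever; when q_i > 1 the chain keeps revisiting terminal indices above p_i.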

Difficulty: 8.  I feel like I’m missing a lot of the point of the ET0L language by only using one table, but that’s what the reduction does.  I also wish I could confirm my explanation about why a non-empty grammar means the expression was not 0*.