I had never heard of “Reynolds Covering” before, but it is a pretty natural definition for a relation between grammars.
The problem: Reynolds Covering for Context-Free Grammars. This is problem AL11 in the appendix.
The description: (I’m sort of combining the definitions in G&J with the definitions in the paper together, to form what I think is the clearest explanation). Given two context-free grammars G = (M, Σ, P, S) and H = (N, Σ, Q, T). (M and N are sets of non-terminal symbols, Σ is our terminal alphabet, P and Q are production rules, and S and T are start symbols).
Is there a function f from M∪Σ to N∪Σ where:
- if x ∈ Σ, f(x) = x (the function doesn’t change any terminal symbols)
- if A ∈ M, f(A) ∈ M. (the function maps non-terminals to non-terminals)
- f(S) = T (the start symbol in G maps to the start symbol of H)
- For each production A->x1x2…xn in P, the production f(A)->f(x1)f(x2)…f(xn) is in Q.
Example: Here are two simple grammars for the “palindromes of odd length” grammar:
S->ASA | BSB | ε
T->XTX | YTY | ε
Clearly, a function that maps S to T, A to X, and B to Y will work. But notice that it’s not necessarily immediately obvious that you couldn’t map A to Y and B to X (you in fact get the same grammar, just with the productions mixed around). That’s the root of the idea that makes this problem hard.
The reduction: G&J’s reference takes you to a paper that’s from a hard to find conference. But luckily Rosenkrantz and Hunt wrote up the proof in a later paper that I found. The reduction goes from Clique. The paper calls the graph “J” to avoid confusion with the grammar G we’ll be created and asks if J has a clique of size k. J has n vertices, numbered 1 to n. We’ll build 2 grammars:
G has start symbol S, non-terminals A1 through Ak, and also non-terminals N1 through Nn, and terminals a,b,c. The production rules are:
S->aAi for each Ai
S->aNi for each Ni
Ai->bAj for all i != j
Ni->bNj if the edge (i,j) exists in the graph.
Ai->c for all i
Ni->c for all i
H has start symbol T, non-terminals N1 through Nn, and also terminals a,b,c. The production rules are:
T->aNi for all i
Ni->bNj if the edge (i,j) exists in the graph
Ni->c for all i
(So basically H is G without the “A” non-terminals and productions. As we’ll see, our function will need to map those non-terminals to the “N” non-terminals in H in a way that corresponds to a clique)
Suppose J has a clique of size k. Let’s call the vertices in the clique V1..Vk (these vertices are not necessarily N1..Nk). Here’s a function we can make:
f(S) = T
f(Ai)->NVi (The non-terminal in H corresponding to vertex Vi)
f(a)=a, f(b)=b, f(c) = c
This works because the productions in G and H are the same. The productions in G for the A non-terminals are covered by the corresponding productions of the N vertices in H. We know the productions exist because the V vertices form a clique, so all of the edges are there between any pair of vertices, and so all productions between the corresponding non-terminals are there in H.
Suppose J didn’t have a clique of size k, but we did have some mapping from G to H. The paper wants to show that we can’t have even a weaker function called a “weak interpretation” between G and H, and so the proof works in terms of that. But for our purposes, I think we can say:
If there is no clique in J, we need to assign some vertices to the A productions, and two of them, let’s say Aj and Ak would have no edge between them (otherwise the A vertices would have all possible edges between them, and form a clique). We know that G has the production Aj->bAk. If a function f exists between G and H, we would have to map Aj and Ak to two different N non-terminals, let’s say Nu and Nv. But the rules of H say that the production Nu->bNv only happens if vertices u and v were connected by an edge. But since j maps to u and k maps to v, they’re not connected by an edge, which causes a contradiction.
Difficulty: 7. I think this is actually a followable reduction. I’m always happy to see problems like Clique come back and be used in these later sections of the appendix, where it seems like we’re using 3SAT all the time.