Regular Expression Substitution

This week’s problem is one of those “private communication” problems, so I had to create a reduction myself. I’m not confident it’s right, so let’s find out!

The problem: Regular Expression Substitution.  This is problem SR24 in the appendix.

The description:  Given two finite alphabets X and Y (possibly with different numbers of symbols, possibly with some overlap between the symbols), and a regular expression R over X∪ Y.  Also, we have one regular expression Ri over Y for each symbol in X, and a string in Y*.  Can we find a string z in the language of R, and a string wi in each of the Ri languages such that if we start with z, and replace each symbol xi with the string wi, the resulting string is w?

Example:  Here’s a pretty simple example.  Obviously you can use much more complicated regular expressions to make much more complicated things.

X = {a,b,c}.   Y = {1,2,3,4} w = 1231122334

R = X*Y (0 or more symbols from X, ending with a symbol from Y)

R1 = 1 + 11

R2 = 2+22

R3 = 3+33

z can be abcabc4

Each occurrence of a will be replaced by a string from R1 (1 the first time, 11 the second time), each occurrence of b will be replaced by a string from R2, and each occurrence of c will be replaced by a string from R3, getting us w back.

Reduction: G&J say to use X3C.  Here’s what I think I want to do:

We’re given an instance of X3C, so a set S (normally it’s X, but I don’t want to confuse it with the X in this problem) with 3q elements, and a collection C of 3-element subsets of X.  Our alphabet X will have one symbol for each element of each set in C (So an element will be something like ci,j for element i of set j.).  Our alphabet Y will have one symbol for each element in S.

z will be the string s0s1..s3q  (all the elements of S in some arbitrary order).

R will be the regular expression “Strings of length 3q over X that for each set in C either does not use that set at all, or uses all three elements in it exactly once”.  This is the part I’m not confident in.  I’m sure this is a description of a regular expression.  What I’m not sure of is whether this expression can be expressed in a form that is polynomial in the size of S and C.  I don’t know enough about how to minimize regular expressions to be able to answer that.

Anyway, from there, each expression Ri generates the single symbol in Y that is the corresponding element of the set in X (so the expression for the element in X c1,5 would be whatever the first element of set 5 in C is).

The idea is that you generate w by finding the string z  that has the correct elements of the correct sets. The rules for how you can make z give you constraints that your solution is a legal X3C solution.

Difficulty: 9.  Maybe it should be a 10, since I’m not really confident this is correct.  The real trouble I had coming up with a reduction was that strings in regular expressions have a fixed order, but the X3C problem just wants sets, which have no orderings.  So you need a way to either for the single string w to stand for any arrangement of symbols in S.  And while “uses each symbol in Y exactly once” is a regular expression, I’m not sure that can be written in polynomial time relative to the size of S and C either.  The only way I can think to do it offhand is by choosing between all (3q)! permutations of the symbols.

Leave a Reply

Your email address will not be published. Required fields are marked *