# Tag Archives: Difficulty 8

## Arc Coloring

Last week we did a subproblem along the way to the next problem in the appendix.  This week we are making a second stop.

The problem: Arc Coloring.  This problem is not in the appendix.

The description: Given a circle divided into m equally spaced points, and a set {A1..An} of “arcs”, where each arc Ai is represented by a pair (ai, bi) signifying an arc that starts at point ai and continues clockwise to point bi.

A circular arc graph is a graph built from this set of arcs, with one vertex for each arc, and an edge between two vertices if their arcs intersect at one or more points (the starting point ai of an arc does not count for overlap purposes).

The arc coloring question asks: Given an integer K, and a circular arc graph G, can we color G in K or fewer colors?  Alternately, we are asking whether we can partition the given set of arcs into K or fewer groups with no intersections within any group.
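To make the definitions concrete, here is a small sketch (my own Python, not from any paper; the function names and the representation of arcs as (start, end) pairs on points 1..m are my own choices) that builds the circular arc graph from a list of arcs and brute-forces a K-coloring:

```python
from itertools import product

def covered(arc, m):
    # Points covered by arc (a, b): walk clockwise from a to b,
    # excluding the starting point a (per the overlap rule above).
    a, b = arc
    pts = set()
    p = a
    while p != b:
        p = p % m + 1   # points are numbered 1..m; step clockwise
        pts.add(p)
    return pts

def circular_arc_graph(arcs, m):
    # One vertex per arc; an edge whenever two arcs share a covered point.
    cov = [covered(arc, m) for arc in arcs]
    n = len(arcs)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if cov[i] & cov[j]}

def k_colorable(n, edges, k):
    # Brute force over all colorings -- only sensible for tiny instances.
    for coloring in product(range(k), repeat=n):
        if all(coloring[i] != coloring[j] for i, j in edges):
            return True
    return False
```

This is only meant to pin down the definition; the whole point of the post is that deciding K-colorability of these graphs is NP-Complete, so no efficient version of `k_colorable` is expected.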

Example: The paper by Garey, Johnson, Miller, and Papadimitriou gives this example of a circular arc graph:

The right side is the circle and the arcs, the left side is the graph.  So we can see there are edges connecting v1 to v2 and v5 because arc 1 overlaps with both arcs 2 and 5.

This graph can be colored with 3 colors (v2, v3, and v4 form a clique, so at least 3 colors will be needed, and since the remaining vertices v1 and v5 can reuse colors already used on non-adjacent vertices, 3 colors suffice).

Note that we already know that the general graph coloring problem is NP-Complete.  We are asking here whether this restricted kind of graph can be colored in polynomial time.  It will turn out that coloring these restricted graphs is NP-Complete as well.

Reduction: The reason we talked about the Word Problem for Products of Symmetric Groups last time was to reduce from it here.

So we start with an instance of WPPSG- a collection of sets X1 through Xm, an integer K, and a goal permutation π.  We’ll assume that each number from 1 to K occurs in at least one X set: if some number occurs in none of them, then either π moves that number (an automatic “no” instance), or it doesn’t, in which case we can remove the number from consideration and shift all larger numbers down by 1.  We want to build a graph that is colorable in K colors or less if and only if our WPPSG instance is solvable.

Our circle will have K+m points on it.  Each set Xi will generate a set of arcs.  Let li[1] be the index of the first X set that contains i, let li[2] be the index of the second such set, and so on.  So using our example from last time, we’d say that l5[1] is 1 (because X1 is the first set to have a 5 in it), and l5[2] is 3 (because X3 is the second set to have a 5 in it).  We’ll denote the last such label as li[k(i)].

So, for each Xi we build a set of arcs:

• Ai1 = (i, K+li[1])
• Ai2 = (K+li[1], K+li[2])
• …
• Ai,k(i) = (K+li[k(i)-1], K+li[k(i)])

This creates a set of arcs that are disjoint and that together cover the points from i to K+li[k(i)].

Each set Xi also creates an arc Ci = (K+li[k(i)], π⁻¹(i)).  Here π⁻¹ is just the inverse of the permutation π we started with; it tells us, for each number from 1..K, what position that number should end up in.  This Ci arc continues from where the Ai arcs “end” and goes around to π⁻¹(i).
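As a sketch of this part of the construction (my own Python, not from the paper; I represent the X sets as Python sets and π as a dict mapping i to π(i)), the arcs for each i chain through the points K+li[1], K+li[2], … and finish with the C arc back to π⁻¹(i):

```python
def wppsg_to_arcs(x_sets, big_k, pi):
    """Build the circular-arc instance on K+m points (a sketch of the construction)."""
    pi_inv = {v: k for k, v in pi.items()}      # position each number should end up in
    arcs = []
    for i in range(1, big_k + 1):
        # l_i[1..k(i)]: 1-based indices of the X sets containing i
        labels = [j + 1 for j, x in enumerate(x_sets) if i in x]
        prev = i                                # the A arcs start at point i
        for lab in labels:
            arcs.append((prev, big_k + lab))    # A arcs chain through the points K + l_i[...]
            prev = big_k + lab
        arcs.append((prev, pi_inv[i]))          # the C arc wraps around to pi^-1(i)
    return arcs
```

For instance, with K=2, X1 = {1,2}, X2 = {2}, and π swapping 1 and 2, the number 1 gets arcs (1,3) and (3,2), while 2 gets (2,3), (3,4), and (4,1).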

Here is an example of how this gets built from the paper:

The actual proof that this conversion works is hard and complicated.  The general idea is that for each point on the circle, we need a different color for each arc that crosses the point.  The different colors can map to the permutations of the numbers 1..K.  The authors show that you can build a permutation of colors that works based on the permutations you can do on the X sets, and so you can get to a legal permutation if and only if you can get to a legal coloring.

Difficulty: 8.  The construction is cool, but hard to see from the initial problem.  The actual proof is very tricky as well.

## Feasible Register Assignment

On to a new section!  This section is on “Program Optimization”, and is the last “real” section- after this, there is a section of “Miscellaneous” problems and a section on “open” problems.  We’re heading towards the end!

PO1 is Register Sufficiency.  We’ll pick up with the next problem.

The problem: Feasible Register Assignment.  This is problem PO2 in the Appendix.

The description: Given a directed acyclic graph G = (V,A), a positive integer K, and a register assignment for each vertex (each vertex maps to one of K registers), can we create a computation for G (in the same sense as we did in Register Sufficiency) that also has the requirement that at each stage of the computation, we never have two vertices simultaneously assigned to the same register?

Thinking about this in terms of the “stones game” from the Register Sufficiency problem, we are saying that each vertex wants to use a specific stone: not only can we use at most K stones at a time, we also can’t use the same stone for two different vertices that request it at the same time.

In the paper that has the reduction for this problem (and Register Sufficiency), Sethi makes the point that this means that we should keep a value in its register until all of its direct ancestors are computed.  So we are really asking if we can create a sequence of vertices where:

1. For all vertices u,v, and w in the sequence, if u is a direct descendant of w, and v appears between u and w, then u and v have different register allocations.
2. For all vertices u and v in the sequence, if u appears before v in the sequence, v is not a descendant of u.
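The two properties can be checked mechanically.  Here is a sketch (my own Python, not Sethi’s; I represent the DAG with a `children` dict where `children[v]` lists v’s direct descendants, i.e. its operands) that validates a proposed sequence against a register assignment:

```python
def feasible_sequence(seq, regs, children):
    # regs[v] is the register assigned to vertex v.
    pos = {v: i for i, v in enumerate(seq)}

    # Property 2: a vertex's descendants (operands) must come before it.
    for v in seq:
        for c in children.get(v, []):
            if pos[c] >= pos[v]:
                return False

    # Property 1: between a child u and its parent w, no vertex may
    # reuse u's register (u's value must survive until w is computed).
    for w in seq:
        for u in children.get(w, []):
            for v in seq[pos[u] + 1 : pos[w]]:
                if regs[v] == regs[u]:
                    return False
    return True
```

A tiny hypothetical check: with one edge w → u, putting a vertex v between u and w works only if v avoids u’s register.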

Example: Here is the example graph from Sethi’s paper that we used before:

Suppose we had the following allocation of registers to vertices (this comes from p. 245 of the paper):

| node | c | x | t1 | b | t2 | t3 | a | t4 |
|------|---|---|----|---|----|----|---|----|
| register | 2 | 3 | 2 | 1 | 2 | 2 | 3 | 1 |

This allocation works if we use the sequence given in the table (c, then x, then t1, and so on).  We can see property 1 satisfied, for example, between t3 and x.  x is a direct descendant of t3, and so every vertex between x and t3 in the sequence cannot use x’s allocated register 3.  The second property is also true- we never have an ancestor vertex in the graph appear in the sequence before its descendant.

Here is a register allocation and sequence that will not work:

| node | c | x | b | a | t1 | t2 | t3 | t4 |
|------|---|---|---|---|----|----|----|----|
| register | 2 | 3 | 1 | 3 | 2 | 2 | 2 | 1 |

The problem here is again between t3 and x.  In this sequence, node a also uses register 3 in between, which violates property 1.

In programming terms, what is happening is that the second sequence copies the value of a into register 3, destroying the value of x that was already in there.  But the graph says we needed x to compute t3, and so by the time we got to the part of the sequence where we computed t3, we no longer had x’s value in a register.

Reduction: Sethi goes from 3SAT.  Our 3SAT instance will have n variables and m clauses.

The basic idea is that we will construct a graph with:

• One leaf vertex for the positive and negative versions of each variable (nodes sk and ~sk for variable k).  These vertices are both allocated to register “Sk”.
• For each clause, construct the following subgraph: The lowercase letters are the vertex names.  The capital letters are the register assignments of the vertices (so, the q vertices map to registers “Qi1” through “Qi3“, and so on).  Note that the positive and negative rij vertices both map to the same register “Rij“.
• Since r vertices represent literals in clauses, connect from them to the s vertices that represent the variables represented by the literal.  In other words, if the literal rij (literal j of clause i) is positive and represents variable xk, we connect an edge from rij to sk and from ~rij to ~sk.  If the literal is negative, we connect from rij to ~sk and from ~rij to sk.

Suppose the formula is satisfiable.  Then if variable xk is true in the satisfying assignment, we add the vertex sk, all of its direct ancestors, and then ~sk to the sequence (in order from k=1 to n).  If xk is false, we add ~sk, then all of its ancestors, then sk.

We know that all of these ancestors are “r” vertices in clauses that contain the variable, and that those “r” vertices have only 1 descendant (the s vertex corresponding to the setting of the variable).  We also know that for all i and j, rij and ~rij have different children (they point to different parities of the corresponding s vertices).  This means we have so far added only one vertex out of each rij/~rij pair.  Since each pair shares a register allocation (and they are the only vertices sharing that register allocation), we’re ok so far.

If we have added the “positive” rij vertex to the sequence, next we add its parent pij to the sequence.  This “releases” the register allocation for rij, allowing us to add ~rij to the sequence.

After doing that for all rij, we need to handle the clauses where we added the “negative” ~rij to the sequence, but not the “positive” version yet.  It turns out that in each clause, since the clause is satisfied, we must have put some rij in the sequence already.  Let’s pretend that the literal that is true is literal 1.  So we know we have ri1 (and thus pi1) in the sequence, and since we have all “negated” r vertices, we must already also have ~ri2 in the sequence.  This lets us add the vertex qi1, which releases the register allocation Ri2 that was held by ~ri2, letting us add ri2 to the sequence, and then we can follow that process around the clause widget.  This will then let us add all Q vertices to the sequence, giving us a legal ordering of the vertices.

The other direction is a little more convoluted.  First, Sethi shows that if we have a pair of “parent” vertices (u1 and u2) with the same register allocation, and those parents each have child vertices (v1 and v2) with the same register allocation, then u1 comes before u2 in a legal sequence iff v1 comes before v2 in that sequence.  Since parents have to come after children in the sequence, the only way for that not to happen is if we had a sequence v2, v1, u1, u2.  But that sequence forces us to visit both v1 and v2 while they are both holding the same register, which can’t happen.

Next, we look at a sequence for the graph we build from a satisfiability instance.  Sethi shows that in any legal sequence there must be, for each clause i, some rij that appears before the corresponding ~rij.  If that weren’t the case, we’d have no legal way to get to the Q vertices, since each ~rij would hold its register allocation the whole time, meaning we could never put the positive rij in the sequence.

The clause literal whose rij appears first corresponds to a literal set to true in that clause, and so gives us a way to satisfy the entire formula.

Difficulty: 8.  This is a pretty tricky construction, even if it is similar to other SAT reductions we’ve seen.  But the need to add the extra p vertex on the positive (but not negative) side, and the Q vertex having as descendants rij and ri(j+1) make this hard to follow.

## Tree Transducer Language Membership

Sorry for vanishing for so long- I was trying to track down the reference for this problem, which is from a Ph.D. thesis from 1977, so was hard to get.  I probably could have (or should have) moved on to the next problem while we were working on that, but doing these problems in order is too ingrained in me to change now.

The problem: Tree Transducer Language Membership.  This is problem AL21 in the appendix.

The description: Given a description of a Top-Down Finite State Tree Transducer (see below) T and a string w in its output grammar, is w generated from some initial string by T?

A Top-Down Finite State Tree Transducer (abbreviated “t-fst” by Reiss in his thesis) defines a set of rules for rewriting trees into other trees, and through them, strings into other strings.  Each rule replaces a tree (or a subtree) with a new tree (or subtree).

Example: Here’s an example Reiss uses:

What you’re seeing here is a set of rules that can rewrite strings of the form a^n into strings of the form a^(n^2).  The bottom part shows how this set of rewritings can turn the string “aa” into the string “aaaa”.  First, we apply the first rule (on our starting “q1” tree) into the second tree.  Then we have a second rule to replace a q1 tree with a single child and a single grandchild with the same tree without the q1.  We have similar rules to remove the q2 symbols in the tree.  The final tree is a derivation for “aaaa”.

The reason the capital “A” symbols are in the trees is that these trees are parse trees for context-free grammars.  In particular, these trees come from the CFG:

A->Aa | a

Notice though that our tree rewriting rules only turn certain parse trees into other parse trees.

So, an instance of our problem is: Given a result string (such as “aaaa”) does there exist some initial string (such as “aa”) that our tree rewriting rules can generate it from?  Reiss calls this “inverting” a t-fst.

Reduction: Reiss reduces from 3SAT.  Our 3SAT instance will have m variables and r clauses.  We will assume that each variable appears at most once in a clause, and that r is an exact power of 2 (r = 2^k).  We can add dummy clauses to ensure this.

First, he defines a “standard” set of tree rewriting rules.  These rules are always the same and do not depend on our SAT instance.  The rules will take a string of the form 1^k\$<variable settings>\$, where <variable settings> is a string of m “t” or “f” symbols corresponding to the settings of the variables.

The output of the transformations is a string built out of one substring for each clause: 0^(m+7)\$<m a, b, or c symbols>.  The substrings for each clause are concatenated together.

Our problem instance is to start with a string in the form of this output transformation and ask whether an input string exists (and to show that one does if and only if the SAT instance was satisfiable).  Each variable contributes an a, b, or c symbol to the clause substring as follows:

• If the variable does not appear in the clause, we choose a.
• If the variable appears positively in the clause, we choose b.
• If the variable appears negatively in the clause, we choose c.
• We also reverse the ordering of the letters (so variable 1’s letter appears last)
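The encoding rules above are easy to mechanize.  Here is a sketch (my own Python, not Reiss’s; I represent a clause as a dict mapping a variable number to True for a positive occurrence and False for a negated one) that builds the target string:

```python
def clause_string(clause, m):
    # clause maps variable number -> True (positive) / False (negated).
    letters = []
    for k in range(m, 0, -1):        # reversed order: variable 1's letter last
        if k not in clause:
            letters.append('a')      # variable absent from the clause
        elif clause[k]:
            letters.append('b')      # positive occurrence
        else:
            letters.append('c')      # negative occurrence
    return '0' * (m + 7) + '$' + ''.join(letters)

def target_string(clauses, m):
    # Concatenate one substring per clause.
    return ''.join(clause_string(c, m) for c in clauses)
```

Running this on the two example clauses (v1, v2, ~v3) and (~v1, v3, v4) with m = 4 reproduces the string given below.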

So, suppose we have (v1, v2, ~v3) and (~v1, v3, v4) as two clauses.  Our target string w would be: 00000000000\$acbb00000000000\$bbac

We’re looking for a string like 1\$tftt: a number of 1’s equal to the log of the number of clauses, and then the truth values of the variables.

Here’s another example from the thesis.  Note that each variable appears in each clause, so there are no “a” symbols:

If the formula is satisfiable, then we have an input string (from the settings of the variables) that will hopefully generate the output string.  The way the rewriting rules work is that the 1’s at the start of the string generate a subtree for each clause (copying all of our truth values), and then from there, we generate the string for each clause.

In each clause, we need to generate m+7 0 symbols as well as the a’s, b’s, and c’s that correspond to the clause.  Each of the a’s eventually maps to a 0 in our rewriting, which gives us m-3 of the 0’s we need- we still need to generate 10 more.  A literal set to true gets us 3 0’s and a literal set to false gets us 2 0’s.  So if the clause is satisfied, we will have 7-9 of these 0’s, and only 6 if the clause is not satisfied.  The replacement of the \$ can generate 1-3 more 0’s.  So if the clause is satisfied, we will be able to get our 10 0’s, but if it’s not, we will not be able to.
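The counting argument can be checked directly.  Here is a sketch (my own Python, using the same hypothetical clause representation as above) of whether a given assignment lets a clause reach its m+7 zeros:

```python
def clause_can_make_zeros(clause, assignment, m):
    # Counting argument: each absent variable ('a') yields one 0,
    # a true literal yields 3, a false literal yields 2, and the $
    # replacement can yield 1-3 more.  We need m+7 zeros in total.
    zeros = m - 3                         # the m-3 'a' symbols
    for var, positive in clause.items():
        literal_true = assignment[var] == positive
        zeros += 3 if literal_true else 2
    return any(zeros + extra == m + 7 for extra in (1, 2, 3))
```

With m = 4 this gives 8-10 zeros plus the 1-3 from \$ for a satisfied clause (reaching 11 = m+7), but at most 7+3 = 10 for an unsatisfied one, matching the argument above.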

In the other direction, if some string exists that can generate our clause string w, we know it has to start with k 1’s, then a \$, then m t or f symbols.  The same logic will show that any string of that form that does not satisfy a clause will not generate the correct number of 0’s.  So whatever initial string was created to generate w has to be a way to satisfy our formula.

Difficulty: The hardest thing is to understand what is happening with the tree replacement.  Once you have that, the whole “figure out a way to handle 1-3 (but not 0) ways to satisfy a clause” is something that we’ve seen a few times.  So I guess I’d say an 8 for the actual reduction.