# Tag Archives: 3SAT

## Code Generation With Unfixed Variable Locations

Sorry for the long break.  I’m hoping to use this summer to get most of the rest of the way through this book.

The Problem: Code Generation With Unfixed Variable Locations.  This is problem PO8 in the appendix.

The Description: Given a program (a sequence of instructions I1..In), a set V of variables, a “short address span” B, and a desired program size M, we want to assign addresses to both the instructions of the program and the variables.  If an instruction accesses a variable that is placed within B addresses of that instruction, we can represent the instruction in one byte (using the “short address”).  Otherwise, the instruction takes two bytes (using the “long address”).  Can we place all instructions and variables in M addresses or fewer?

Example: Suppose I have the following instructions (A, B, and C are variables, “ref” just means the instruction refers to that variable)

```
ref A
ref A
ref B
ref C
```

If B=1 (so all short references have to be within one instruction of the variable address), we can place all instructions and variables in 7 addresses, for example:

```
ref A
A
ref A
ref B
B
C
ref C
```

If instead that second A reference was at the end of the sequence of instructions:

```
ref A
ref B
ref C
ref A
```

…then (because we can’t change the order of the instructions in the program), we would have to make one of the references to A long, for example:

```
ref A
A
ref B
B
C
ref C
ref A (long)
```

That final reference to A is long, so this program needs 8 bytes of address space.
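To make the cost function concrete, here is a small sketch that checks the size of a given placement.  It makes a simplifying assumption I’m introducing for illustration: distance is measured in layout slots rather than actual byte addresses (the two interact in the real problem, since a long reference pushes later addresses further away).  It reproduces the two examples above:

```python
def layout_size(layout, B):
    """Total bytes for a layout: a list of ("var", name) cells and
    ("ref", name) instructions.  A reference within B slots of its
    variable's cell is short (1 byte); otherwise it is long (2 bytes).
    Measuring distance in slots is a simplification of the problem's
    address-based distance."""
    var_slot = {name: i for i, (kind, name) in enumerate(layout)
                if kind == "var"}
    total = 0
    for i, (kind, name) in enumerate(layout):
        if kind == "var":
            total += 1                       # the variable's own cell
        elif abs(i - var_slot[name]) <= B:
            total += 1                       # short reference
        else:
            total += 2                       # long reference
    return total

# the two layouts from the examples above
good = [("ref", "A"), ("var", "A"), ("ref", "A"), ("ref", "B"),
        ("var", "B"), ("var", "C"), ("ref", "C")]
bad = [("ref", "A"), ("var", "A"), ("ref", "B"), ("var", "B"),
       ("var", "C"), ("ref", "C"), ("ref", "A")]
print(layout_size(good, B=1))  # 7
print(layout_size(bad, B=1))   # 8
```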

Reduction: The paper by Robertson uses 3SAT, so let’s start with a 3SAT instance with n variables and m clauses, where each clause Cj contains literals Aj1, Aj2, and Aj3.  (Clauses with just 1 or 2 literals are fine too.)

Robertson notes that for any two literals V and W, if V and ~W appear together as one (complete, two-element) clause, and ~V and W appear as a different clause, then V and W must have the same truth value in order to satisfy both clauses.  This gives us a way to add new variables to the formula whose truth values must match some existing variable’s.  Specifically, if some variable V shows up in our formula more than 5 times, we can create a new variable W to replace three of those occurrences of V, and add two clauses to the end (the clauses (V ∨ ~W) and (~V ∨ W)).  This forces our new W to have the same truth value as V in any satisfying assignment, and reduces the number of occurrences of V.  We repeat this process until no variable appears more than 5 times in the formula.  So now we can say that each variable Vi appears in clauses Ci1 through Ci5 (or fewer, if the variable occurs fewer than 5 times).
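Here is a sketch of that occurrence-limiting step in Python, with literals represented as signed integers (a representation I’m choosing for illustration; each replacement round removes 3 occurrences of the overused variable but adds back 2 in the new equality clauses, so the count drops by 1 per round and each fresh variable gets exactly 5 occurrences):

```python
def limit_occurrences(clauses, num_vars, cap=5):
    """Repeatedly replace three occurrences of an overused variable v
    with a fresh variable w, adding clauses (v or ~w) and (~v or w) to
    force w == v.  Literals are signed ints: +v / -v."""
    clauses = [list(c) for c in clauses]
    while True:
        counts = {}
        for c in clauses:
            for lit in c:
                counts[abs(lit)] = counts.get(abs(lit), 0) + 1
        over = [v for v, n in counts.items() if n > cap]
        if not over:
            return clauses, num_vars
        v = over[0]
        num_vars += 1
        w = num_vars
        replaced = 0
        for c in clauses:
            for i, lit in enumerate(c):
                if replaced < 3 and abs(lit) == v:
                    c[i] = w if lit > 0 else -w  # keep the literal's sign
                    replaced += 1
        clauses.append([v, -w])   # (v or ~w)
        clauses.append([-v, w])   # (~v or w)

# variable 1 appears 7 times here, so two rounds are needed
clauses = [[1, 2, 3], [1, -2, 4], [-1, 3, -4], [1, 2, -3],
           [-1, -2], [1, 4], [1, -3]]
new_clauses, n = limit_occurrences(clauses, 4)
```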

The program we will build will consist of “modules” that can be moved independently.  Each variable will have two modules (“t” and “f” modules), and each clause will have one.  Here is a picture of a variable “t” module:

“l” is the paper’s version of B: the maximum distance at which a reference to a variable can still be short.

What we’re seeing here is:

• A “slot” we can put a variable (or more than one) into
• 16 instructions that refer to a new variable ui
• 3 instructions that reference each of the variables corresponding to the (at most 5) literals and clauses the variable can appear in.

The “f” module is similar, except that the definition of τr is negated (it’s the positive variable if ~Vi occurs as a literal).  The new variable ui appears in the t and f modules for variable i and nowhere else, so the only way references to it can be short is if it is placed into the “slot” in either the “t” module (meaning variable i is set to true) or the “f” module (meaning variable i is set to false).  If we put ui in the slot of (say) the “f” module, then we have enough room to also place the variables corresponding to the literals in the same block, making everything short.  But by putting ui in the “f” module, all of the references to it from the “t” module will be long (adding 16 extra bytes of address space), making all of the literals too far away from the slot for the slot to be useful in that module.

We also have a module for the clause (the “c”-module):

Notice that the versions of the literals that actually appear in the clause are to the right of the slot.  These α variables are the same ones we had in the variable modules (so each α variable shows up once in a “c” module, but 3 times in a “t” or “f” module).  Since there are more references to the α variables in our variable modules, the optimal placement is to put those variables in the slot of the “t” or “f” modules (not in the “c” module).  The “remaining” α variables (the negations of the ones we put into the slot in either the “t” or “f” modules, which it didn’t help us to put in the slot of the other module) can go into the slot here and make one of the corresponding β references short.

We do have that extra γj cell to play around with, and here is where either the paper has a mistake, or I lose the ability to follow.  What is supposed to happen is that if we fill the slot with a variable that is referred to by one of the β variables, then that reference will be short, and the reference from γ to δ will be within the range of a short reference.  That’s fine, but the paper wants this to be the α variable that we did not place in the “t” or “f” module.  But if placing the α in the “t” module meant setting the variable to true, then what is not in the module is the negation of that literal, which feels like we are doing things backward: if we want to set the variable to true, we should have put ui in the “f” module, so that the literal that can be placed in our clause module corresponds to the true version of the literal.

I’m not sure it matters much to the reduction.  If I’m right, it just means we should have labeled the α variables the negations of what they really are.  But it does worry me.

The proof finishes by choosing values of B and M that make it all work.  B ends up being 31 (it comes from the distance we need to separate the 16 ui instructions from the τ instructions), and M ends up being 16n + 4 times the total number of literals in the formula.

Difficulty: 7.  This has a lot of pieces, but (except for that end part) I think it’s pretty feasible to see how all of the pieces fit together.

## Code Generation on a One-Register Machine

I’ve done a lot of traveling over the past month, so I’m sorry about missing posts.  Things should be back to normal now.

The problem: Code Generation on a One-Register Machine.  This is problem PO4 in the appendix.

The description: Given a directed acyclic graph G = (V,A), in which no vertex has an out-degree larger than 2, and a positive integer K.  The leaves of this graph (the vertices with out-degree 0) are our starting values, sitting in memory.  Can we compute the values in all root vertices (the vertices with in-degree 0) in K instructions or fewer, if our only instructions are:

• Load a value from memory into our only register.
• Store the value in the register into memory.
• Do an operation combining the value in the register and a value in memory.  The operation must combine two children of some vertex in the graph, and the “result” is the parent vertex.  The result value replaces the original value in the register.
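The machine model can be sketched like this (the operation itself is abstract, so `op` just records the result’s name and counts the instruction; the class and the two-leaf example are my own illustration):

```python
class OneRegisterMachine:
    """Minimal model of the machine: one register, a memory of named
    values, and the three instruction types from the problem.  Each
    method costs one instruction."""
    def __init__(self, leaves):
        self.memory = set(leaves)   # leaf values start in memory
        self.register = None
        self.count = 0

    def load(self, name):
        assert name in self.memory
        self.register = name
        self.count += 1

    def store(self):
        self.memory.add(self.register)
        self.count += 1

    def op(self, result, operand=None):
        # combine the register with a value in memory; the result
        # replaces the register's contents
        assert operand is None or operand in self.memory
        self.register = result
        self.count += 1

# a hypothetical graph with leaves 1 and 2 and a single "+" root:
m = OneRegisterMachine(["1", "2"])
m.load("1")      # register <- 1
m.op("+", "2")   # register <- the "+" node's value
m.store()        # "+" counts as computed once it is stored
print(m.count)   # 3
```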

Example:  Here’s a simple graph:

Here, we can compute the “+” node by:

• Doing the + operation between the register and 2
• Storing the + operation to memory (G&J’s definition of the problem says that the node is not computed until the value is stored)

We can compute the “-” node in another 3 instructions, and since the value of the “-” node is still in the register, compute the “*” node in 1 more instruction, and store it with our last instruction.

Here’s a more complicated graph:

To do this one, we will have to load the value in node 2 lots of times.  For example, here is the set of instructions I came up with to compute h:

• op to create a (only 1 operand)
• store a
• op to create c (3 is in memory)
• store c
• op to create d (4 is in memory)
• op to create f (c is in memory)
• op to create g (a is in memory)
• store g
• op to create b (3 is in memory)
• op to create e (c is in memory)
• op to create h (g is in memory)
• store h

It’s possible that we can do this in fewer instructions, but hopefully you can see why this problem is hard: knowing which value to keep in the register is tricky.

Reduction: G&J point out that the reduction is from a paper by Bruno and Sethi, which uses 3SAT to do the reduction.  The instance they build is pretty complicated.  I also came across a paper by Aho, Johnson, and Ullman, who extend the result of the Bruno and Sethi paper with a nice reduction from Feedback Vertex Set.  I think this reduction is easier to follow, so we’ll go with that.

So, we are given an instance of FVS: a directed graph G and an integer K.  We are looking for a set F of at most K vertices such that every cycle in G goes through some element of F.

We are going to build our Code Generation graph D as follows:

• For every vertex in G with outdegree d, build a “left chain” of d+1 vertices.  So if vertex b has 2 edges leaving it, we will create 3 vertices b0, b1, and b2.  b2 will connect to b1, and b1 will connect to b0.
• Each of the “0” vertices at the bottom of these chains connects to 2 distinct memory values (they will be the leaves of the code graph).
• If vertex v has outdegree d, each vertex in v’s chain (above the “0” vertex) connects to the “0” vertex of a different neighbor of v in G.
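Here is a sketch of the construction in Python, with graphs as adjacency dictionaries.  The exact wiring of chain vertices to the neighbors’ “0” vertices follows my reading of the description above (chain vertex i points at the i-th neighbor’s “0” vertex), so treat the details as an assumption:

```python
def build_code_graph(G):
    """Build the code-generation DAG D from a directed graph G, given
    as {vertex: [out-neighbors]}.  A vertex v with outdegree d becomes
    a left chain vd -> ... -> v0; v0 gets two fresh leaves, and chain
    vertex vi (i >= 1) also points at the 0-vertex of v's i-th
    neighbor.  D maps each vertex to its list of children."""
    D = {}
    leaf_id = 0
    for v, nbrs in G.items():
        # two distinct memory values (leaves) under the chain's bottom
        l1, l2 = f"leaf{leaf_id}", f"leaf{leaf_id + 1}"
        leaf_id += 2
        D[l1], D[l2] = [], []
        D[f"{v}0"] = [l1, l2]
        for i, w in enumerate(nbrs, start=1):
            D[f"{v}{i}"] = [f"{v}{i - 1}", f"{w}0"]
    return D

# a 2-cycle in G: each chain's level-1 vertex needs the other's 0-vertex
D = build_code_graph({"a": ["b"], "b": ["a"]})
print(D["a1"])  # ['a0', 'b0']
```

Every vertex in D has out-degree at most 2, as the problem requires: each chain vertex has one child in its own chain and one neighbor’s “0” vertex.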

Here is an example from the paper:

Notice that if we don’t have the edges between the chains, we can compute the entire chain with just 2 loads (of the leaves that start in memory).  So, the only loads needed to compute all of D happen in the leaves, or in some of the “level 1” vertices that are parents of the leaves.   If we have to re-load one of those vertices, it is because there is no optimal strategy to avoid loading it, which means it’s part of a cycle.

For example, look at the a and b chains in the picture above.  If we didn’t have any of the c or d vertices or edges in our graph, we could compute a1 and b1 without loading any vertex that is not a leaf: compute b0, b1, b2, then a0, then a1 (which uses a0 from the register and b0 from memory).  The reason we can do this is that while a1 depends on b0, none of the b vertices depend on anything in a, which gives us a chain to do first.  We need to reload a value when we have a circular dependency between chains (so there is no correct chain to do first).  That’s the relationship between the chains and the feedback vertex set.

This works in the other direction as well- if we are given the feedback vertex set in G, we can compute those vertices first in D, and then load them all once as needed to compute D.

The paper says that in the example graph, the set {d} by itself is a Feedback Vertex Set, and the optimal computation ordering is: d0, c0, c1, b0, b1, b2, a0, a1, d1.  That final d1 needs a re-load of d0.  The 1 extra load corresponds to the 1 vertex in our Feedback Set.

Difficulty: 6.  Maybe 7.  I think this is right at the limit of what a student can figure out, but I would also want a more rigorous proof about the connection between the extra loads and the feedback set, which is probably tricky to come up with.

## Tree Transducer Language Membership

Sorry for vanishing for so long- I was trying to track down the reference for this problem, which is from a Ph.D. thesis from 1977, so was hard to get.  I probably could have (or should have) moved on to the next problem while we were working on that, but doing these problems in order is too ingrained in me to change now.

The problem: Tree Transducer Language Membership.  This is problem AL21 in the appendix.

The description: Given a description of a Top-Down Finite State Tree Transducer (see below) T and a string w in its output grammar, is w generated from some initial string by T?

A Top-Down Finite State Tree Transducer (abbreviated “t-fst” by Reiss in his thesis) defines a set of rules for rewriting strings into other strings by way of their parse trees.  Each rule replaces a tree (or a subtree) with a new tree (or subtree).

Example: Here’s an example Reiss uses:

What you’re seeing here is a set of rules that can rewrite strings of the form a^n into strings of the form a^(n^2).  The bottom part shows how this set of rewritings can turn the string “aa” into the string “aaaa”.  First, we apply the first rule (to our starting “q1” tree), turning it into the second tree.  Then we have a second rule that replaces a q1 tree with a single child and a single grandchild with the same tree without the q1.  We have similar rules to remove the q2 symbols in the tree.  The final tree is a derivation of “aaaa”.

The reason the capital “A” symbols are in the trees is because these trees are parse trees for context-free grammars.  In particular, these trees come from the CFG:

A->Aa | a

Notice though that our tree rewriting rules only turn certain parse trees into other parse trees.

So, an instance of our problem is: given a result string (such as “aaaa”), does there exist some initial string (such as “aa”) from which our tree rewriting rules can generate it?  Reiss calls this “inverting” a t-fst.

Reduction: Reiss reduces from 3SAT.  Our 3SAT instance will have m variables and r clauses.  We will assume that each variable appears at most once in a clause, and that r is an exact power of 2 (r = 2^k).  We can add dummy clauses to ensure this.

First, he defines a “standard” set of tree rewriting rules.  These rules are always the same and do not depend on our SAT instance.  The rules will take a string of the form 1^k\$<variable settings>\$, where <variable settings> is a string of m “t” or “f” symbols corresponding to the settings of the variables.

The output of the transformations is a string built out of one substring for each clause: 0^(m+7)\$<m a, b, or c symbols>.  The substrings for each clause are concatenated together.

Our problem instance is to start with a string in the form of this output transformation and ask whether an input string exists (and to show that one does if and only if the SAT instance was satisfiable).  Each variable contributes an a, b, or c symbol to each clause’s substring as follows:

• If the variable does not appear in the clause, we choose a.
• If the variable appears positively in the clause, we choose b.
• If the variable appears negatively in the clause, we choose c.
• We also reverse the ordering of the letters (so variable 1’s letter appears last)

So, suppose we have (v1 ∨ v2 ∨ ~v3) and (~v1 ∨ v3 ∨ v4) as our two clauses.  Our target string w would be: 00000000000\$acbb00000000000\$bbac

We’re looking for an input string like 1\$tftt: a number of 1’s equal to the log of the number of clauses, then a \$, then the truth values of the variables.
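Here is a sketch of how the target string w could be assembled from a 3SAT instance, following the per-clause format above (each clause substring carrying its own \$ separator); the representation of clauses as dictionaries is my own:

```python
def build_output_string(num_vars, clauses):
    """Build the target string w.  clauses is a list of dicts mapping
    variable number -> True (positive literal) or False (negated).
    Each clause contributes 0^(m+7), a $, and its m letters in
    reverse variable order, per the format described above."""
    parts = []
    for clause in clauses:
        letters = []
        for v in range(1, num_vars + 1):
            if v not in clause:
                letters.append("a")      # variable absent from clause
            elif clause[v]:
                letters.append("b")      # appears positively
            else:
                letters.append("c")      # appears negatively
        parts.append("0" * (num_vars + 7) + "$"
                     + "".join(reversed(letters)))
    return "".join(parts)

# the two example clauses (v1 v v2 v ~v3) and (~v1 v v3 v v4):
w = build_output_string(4, [{1: True, 2: True, 3: False},
                            {1: False, 3: True, 4: True}])
print(w)  # 00000000000$acbb00000000000$bbac
```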

Here’s another example from the thesis.  Note that each variable appears in each clause, so there are no “a” symbols:

If the formula is satisfiable, then we have an input string (from the settings of the variables) that will hopefully generate the output string.  The way the rewriting rules work is that the 1’s at the start of the string generate a subtree for each clause (copying all of our truth values), and then from there, we generate the string for each clause.

In each clause’s substring, we need to generate m+7 0 symbols as well as the a’s, b’s, and c’s that correspond to the clause.  Each of the a’s eventually maps to a 0 in our rewriting, which gives us m-3 of the 0’s we need; we still need to generate 10 more.  Setting a literal to true gets us 3 0’s and setting it to false gets us 2 0’s.  So if the clause is satisfied, we will have 7-9 0’s from the literals, and only 6 0’s if the clause is not satisfied.  The replacement of the \$ can generate 1-3 more 0’s.  So if the clause is satisfied, we will be able to get our 10 0’s, but if it’s not, we will not be able to.
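The counting argument can be checked with a few lines of arithmetic; the function below is just my restatement of the counts above, taking the best case of 3 0’s from the \$:

```python
def zeros_available(m, num_true_literals):
    """Maximum number of 0's a clause substring can generate, per the
    counting argument: m-3 from the 'a' letters, 3 per true literal,
    2 per false literal (3 literals total), plus up to 3 from the $."""
    from_a = m - 3
    from_literals = (3 * num_true_literals
                     + 2 * (3 - num_true_literals))
    from_dollar = 3            # best case for the $ replacement
    return from_a + from_literals + from_dollar

m = 10
needed = m + 7
# satisfied clauses (1-3 true literals) can reach m+7 zeros; an
# unsatisfied clause (0 true literals) falls exactly one short
for t in range(4):
    print(t, zeros_available(m, t) >= needed)
```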

In the other direction, if some string exists that can generate our clause string w, we know it has to start with k 1’s, then a \$, then m t or f symbols.  The same logic will show that any string of that form that does not satisfy a clause will not generate the correct number of 0’s.  So whatever initial string was created to generate w has to be a way to satisfy our formula.

Difficulty: 8.  The hardest thing is to understand what is happening with the tree replacement.  Once you have that, the whole “figure out a way to handle 1-3 (but not 0) true literals in a clause” idea is something that we’ve seen a few times.