Category Archives: Appendix- Program Optimization

Programs With Formally Recursive Procedures

This is the last of the “Program Optimization” problems, and the next section is the last one on “Miscellaneous” problems, which has 19 problems in it.  So, unless some diversion into a problem not in the book happens, we’re on the countdown through the last 20 problems!  The end is in sight!

The problem: Programs with Formally Recursive Procedures.  This is problem PO20 in the Appendix.

The description: Given a set of function definitions and a program that makes a sequence of function calls using those definitions (the paper by Winklmann uses Algol because he wrote it in 1982.  But any language that allows recursion should work).   Are any of the procedures in the program formally recursive?  For our purposes, “formally recursive” means that there exists a chain of calls that has a call to a function higher in the chain (not necessarily a parent, any ancestor will do).

My first instinct was to wonder if this problem was actually undecidable.  It’s not because we will have some fixed # of procedures d, and so any call chain needs to be checked only to depth d+1- if we get that far, we must have a repeat someplace.  If we never get that far, we can back up and check other call chains, eventually checking all of the paths through the call tree exhaustively.  The proof that the problem is in NP is in the paper, but it involves representing the call tree as a DAG so we can use one vertex for two equivalent paths.  Winklmann says that if you do that the number of vertices is “bounded” by the number of procedures in the program (he doesn’t say what that bound is, though).

Example: It’s hard to make a small example that’s not obvious.  How about this:







If “F” is replaced by a call to C, the program is formally recursive, because the call chain main->A->C->A has a repeat.  If we replace that F with another call to B the program is not formally recursive because even though recursive functions like E exist (and even though B gets called twice in two different chains), there is no call chain that will start at main and hit E (or anything) more than once.

Reduction: Winklmann uses 3SAT.  So we start with a formula B with m clauses and n variables.  We’ll write some functions:

AssignXi( generates two calls to AssignXi+1. The function has one parameter for each clause.  The first call simulates setting variable Xi to true by passing a “true” value in all parameters that represent clauses that have variable Xi represented positively.  The second call simulates setting variable Xi to false by passing a “true” value in all parameters that represent clauses that have variable Xi returning negatively.  In all other cases, the function passes on the ci parameter it was given.

AssignXn( once we reach the end of the variables, we have to stop.  This function calls the “Check” function we’ll be defining in a minute twice, once reflecting setting variable Xn to true, one reflecting setting it to false.

Truej( is one “true” function for each clause, each one has a parameter for all clauses with larger numbers. The paper writes this function a little weirdly, but I think a good definition is:

if(cj+1 is true){

So, as long as the parameters are true, we’ll continue calling down through the  true functions to the end.  The last True function is what will make us formally recursive:

Truem(): Calls Assign1(False, False, ...False). Our main program will begin with a call to Assign1 with these exact parameters, so if we can get here, we will have found our recursive call.

The False functions end a call chain:

// do nothing

The Check function’s job is to start the appropriate call to True or False:

if(c1 is true)

Our main program is just a call to Assign1(False, False..False).  It’ll go through all of the various AssignXifuntions trying to assign variables (and thus clauses) to true. Each assignment will be checked, and each check will either hit a False function (meaning a clause wasn’t satisfied, and so we will backtrack and try an alternate assignment) or the final True function, which will generate our recursive call.  So this program has a recursive call chain if and only if the formula was satisfiable.

Notice that while the run of this program might take exponential time, that’s ok because reductions are concerned with the time it takes to create the new problem instance, and there are “just” 2m+n+1 functions to create.

Difficulty: 6.  This is a neat problem and reduction- I think the “write a backtracking program to try all combinations” is pretty gettable.  The hard part is the check and true/false functions at the end.  We can’t really use loops because loops make programs undecidable (you can’t show the problem is in NP by going down d+1 calls if you might have an infinite loop someplace and won’t get through all of those calls).  So we have to build a chain of recursive calls instead, with dead-ends to force the backtracking.

Non-Freedom for Loop-Free Program Schemes

Our old friend Bounded PCP is back for this reduction. What I actually originally wrote in that post was wrong (what I thought was the PCP reduction was actually their reduction for today’s problem), and I modified that post a little to reflect that.  I’m still sad that I can’t find an easy Bounded PCP reduction, but at least for today, all we need is the knowledge that it’s NP-Complete.

The problem: Non-Freedom for Loop-Free Program Schemes.  This is problem PO19 in the appendix.

The description: Suppose we have a set of instructions without loops (so just assignment of a variable, if statements that jump to other instructions, and halt.  Kind of like our Ianov Scheme instructions, but possibly allowing more than one variable).  “No loops” also means that our if statements cannot create a cycle.

Is there a directed path through this set of instructions that can never be followed by any actual computation?  (A program is “free” if all paths are reachable, so “non-free” means some path is not reachable).

Example: Here’s a very simple program:

1: if p then I2 else I3
2: x = 1
3: x = 2

If “p” is a predicate that is always true, than the path I1-I3 can never be realized and the program is non-free.  If p can be true or false, then all paths are possible and the program is free (and so this “non-free” instance is a “no”)

Reduction: As I alluded to at the start, Constable, Hunt, and Sahni reduce from Bounded PCP.  The idea is that they will take a BPCP instance and build a program where all paths are possible if and only if the BPCP instance is unsolvable.  The general idea is to build a scheme where we have 2 variables x1 and x2, which correspond to the two BPCP strings we are building.  At the end of the scheme, we have two predicates: P(x1) and P(x2).  If x1 and x2 are the same (which means we have built the same string) then we can’t get to a path that has true for one and false for the other.

The problem is that we could get that path if we choose a bad set of strings to make x1 and x2, and I’m not sure how we avoid that.  Here’s the picture they put in the paper basically as the entire proof:

I think maybe it might be an example for an instance of BCPP bounded at 1 string from each side (which is why the end case is a set of P2 predicates), but they don’t say and I’m not sure.  It’s not helped by the fact that they use “i” for the number of substrings used, and for an index in one of the strings, or that their font has the letter “l” look the same as the number “1”, or that they don’t explain anywhere why we have 2 levels of predicates, or what those predicates are supposed to be testing for.  I think the top predicates (the P1 ones) are choosing whether we take substring 1,2, and so on to build our solution.  But that means these “h” functions are appending the next character onto the string we’re building, and that’s not said anywhere either.

So, yeah, I don’t know what about this diagram doesn’t let me take incorrect choices for the BPCP problem, leading to a different x1 and x2, and creating a path that way. I definitely wish the paper had more explanation.

Difficulty: 8.  I’m not terribly filled with confidence that my explanation is right, because the explanation in the proof doesn’t really explain much to me.  Am I missing something?

Strong Inequivalence for Monadic Recursion Schemes

As promised, now we extend the proof of the Ianov Scheme problem to other problems.

The problem: Strong Inequivalence for Monadic Recursion Schemes. This is problem PO17 in the appendix.

The description: Suppose we have a sequence of expressions F1..Fn.  Each is of the form:

Fi(x) = if Pi(x) then Ai(x) else Bi(x)

Each Pi is a boolean predicate, each Ai and Bi are compositions of function calls from F or from a given set of “basis” functions f. This is a monadic recursion scheme.  A linear monadic recursion scheme adds the requirement that each A and B composition can have at most one call to one of our F functions.

So, similar to our last problem, if we are given a set of basis functions f, a set of predicates P, and two schemes that take a start variable x0 into an initial function F0, can assign functions and predicates to the two schemes to make them have different outputs?

Example: Let’s turn our Ianov Scheme example into this format:

Scheme 1:

F1(x): If P(x) then F2(x), else f( F2(x))

F2(x): If P(x) then x else x

Scheme 2:

F1(x) If P(x) then x else x

Scheme 2 is the identity function- it always returns x.  Scheme 1 is the identity when P(x) is true, or when f is itself the identity.  So if we have a predicate set of {p(x) = true, p(x) = false} and a function set of {f(x) = x+1}, then we can make Scheme 1 return a different value for its starting variable and thus the two schemes are equivalent.

Reduction: This is the paper by Constable, Hunt, and Sahni again, and in fact builds off of their Ianov Scheme result from last time.  The idea is that if we can take any program in that format, and convert it into a recursion scheme, then we have a reduction.   So, suppose we are at instruction “j” of the program, which will map to function Fj of our scheme. Recall those programs had three kinds of instructions:

  1. x=f(x), where f is some function- f will be a basis function in our recursion scheme, and we will create the statement:Fj(x) = If P1(x) then Fj+1(f(x)) else Fj+1(f(x)). This will take us to the next “line” of the program sending f(x) as the variable we are working on.
  2. If Pk(x) goto I1 else goto I2, then we create the statement:
    Fj(x) = If Pk(x) then FI1 else FI2. This will take us to the corresponding “line” of the program we will do next, based on our predicate Pk
  3. Halt.  Then we just do:
    Fj(x) = x

This turns any program into an equivalent monadic recursion scheme, thus converting our Ianov Scheme problem instance into a monadic recursion scheme instance.

Difficulty: 5.  The actual “Write this program as a set of function statements” task isn’t hard.  As usual, wrapping your head around the definitions of what the problems are talking about is the difficult part.

Strong Inequivalence of Ianov Schemes

I’d never heard of “Ianov schemes” before, but we’ll be using them for the next couple of problems, so let’s take a look!

The problem: Strong Inequivalence of Ianov Schemes.  This is problem PO16 of the appendix.

The description: Given two programs with a single variable x and instructions:

  1. x = f(x) (where f is some function from some given set of functions F)
  2. if p(x) goto I1 else goto I2 (p(x) is a boolean predicate from some given set of predicates P, I1 and I2 are instructions in the program)
  3. Halt

Are the two programs not strongly equivalent: Can we find a function from F for each f, a predicate from P for each p, and an initial value for each variable that gives different final values for the variable in both programs?

Example: Here is one similar to what we will need in a minute:


1: if p(x) then 4 else 2
2: x =  f(x)
3: Halt
4: Halt


1: Halt

Suppose the only choice for f is f(x)= x+ 1(so, changing the variable).  Then the programs are different if we have a p(x) that will let us get to line 2 in P1.  If our only choice of p is something like “p(x) = true”, then the two programs are always equivalent.  If our set of P is (p(x) = true, p(x) = false), then we can choose a predicate that makes the programs inequivalent.

Reduction: The paper from Constable, Hunt, and Sahni basically does this in two parts.  G&J say to use 3SAT, but this looks to me like DNF Non-Tautology.

First (this is Proposition 2.11), they show how to take a DNF formula and turn it into a program.  The program works as a giant chain of if statements checking each literal in each clause.  If a literal is checked to be false, we move on to the next clause.  If all of the literals in a clause are true, we make it to the first (“+”) halting state.  If none of the clauses make it to the “+” state, we go to the second (“-“) halting state.  This means that the “-” state is only reachable if we do not start with a tautology.

Next (This is part iii of Corollary 2.14 of Theorem 2.13), they show that they can extend this into an Ianov scheme program by making the “+” state be just a halt command (so the variable never gets changed), and the “-” state be:

1: x= f(x)  (Where f is some function that changes x)
2: halt.

So the only way this program is different from the trivial “just halt” program is if we can reach the “-” instruction, which only happens if the formula is not a tautology.

Difficulty: 5. The program to test the non-tautology is not that hard.  As usual, the hardest part is parsing all of the definitions in the problem

Inequivalence of Simple Functions

This paper (also referenced in our last problem) is the first one in the Bibliography whose first author started with “T”.  It feels weird that it took so long to get that letter.  Now the only letters missing in the Bibliography are “Q” (though I have a direct proof from Maurice Queyranne in the K-Closure post),  “X”, and “Z”.  I’m not sure I’ll get any “X” or “Z” papers, we’ll see.

The problem: Inequivalence of Simple Functions.  This is problem PO15 in the appendix.

The description: A simple function is a composition of basic functions on non-negative integers from:

  • s(x) = x+1
  • plus(x,y) = x+y
  • sub(x) = x-1
  • selecti (x1..xn) = xi
  • div(x,t) = [x/t] (integer division where t is a constant positive integer)
  • w(x,y) = if(y == 0) then x else 0
  • mod(x,t) = The remainder of dividing x by t (t is a constant positive integer)

We are given two simple functions f and g over some set of initial variables X.  Can we provide starting values in X that give different outputs for f and g?



f(x,y) = s(s(s(x)))

(This returns x+3)

g(x) = s(s(s(w(x,y))))

(This returns x+3 if y is 0, but 3 if y is not 0)

On any input pairs (x,y) where y is 0, f and g will return the same value.  But the functions are inequivalent because f(2,7) = 5 but g(2,7) = 3.

Reduction: Tsichritzis uses Inequivalence of Loop Programs Without Nesting, which we did last time.  The idea is to show that all such program actually define simple functions (and so therefore if we can find the simple function derived from a program, that is our reduction).

First, he shows that a program without loops either returns xi + k, or k, where xi is some variable and k is a constant.  The idea is that since we can only do three things, each command can be rewritten from the top down:

  • If the instruction you are looking at is an assignment X=Y, then replace that with X=A (where A is whatever value Y is.  Either it has been computed previously in the program, or is Y’s starting value
  • If the instruction you are looking at is an increment X=X+1, then replace that with X=A+1, (where A is whatever the previous value of X is.  Either it has been computed previously in the program, or is X’s starting value)
  • If the instruction is X=0, then remember that 0 is the most recent previous value of X.

At the end, we look at what the output variable has been updated to.  It can only have been changed by some amount of additions to some variable’s start value, or by some amount of additions to 0.

So what is left is what goes on inside a loop.  Since loops cannot be nested, we know that the body of the loop only has the basic instructions.  Thus, all a loop can do is, for each variable Yi:

  1. Leave it alone, so, if yi was the contents of the variable at the beginning of the loop, Yi = yi at the end of the loop.
  2. Reset the value to some new constant k
  3. Set it to some other variable Yj‘s initial value yj plus some constant k.  We’ll say in this case that variable Yi links variable Yj

It’s possible for this set of linking to form a cycle.  It’s also possible for variables to depend on variables inside a cycle, but not to be in the cycle themselves.  The third possibility is for as variable to not care about a cycle at all and be independent.  The values of the independent variables are easy to express as simple functions.

If we have a cycle, the paper shows how the changes to the variable in one iteration can get added together by everything in the cycle, and then multiplied by the number of iterations the cycle goes.  These can also be represented as simple functions.

Difficulty: 6  The actual task we’re trying to do (“represent this program as a simple function”) is pretty easy to see.  The details of how to write the simple functions are pretty finicky, though.


Inequivalence of Loop Programs Without Nesting

This is another one of those “mainly ignore most of the problem” reductions.

The Problem: Inequivalence of Loop Programs Without Nesting.  This is problem PO14 of the appendix.

The description: Given:

  • a set of X variables,
  • a subset Y of X which are “input” variables,
  • a specific output variable x0
  • Two programs P1 and P2 with the commands:
    • x <- y (x gets a copy of y)
    • x <- 0  (x gets set to 0)
    • x<- x + 1 (x is incremented by 1)
    • DO xi {S1 S2..Sn} END  (see below)

.. is there a way to assign the variablse in Y so that P1 and P2 produce different outputs in our output variable x0?

Example: The semantics of the “DO” statement are (they’re better described by Tsichrtizis in his paper) that we execute the loop body a number of times equal to the value of xi (if it is 0, we do not execute the loop at all).  The Si statements can be any program statement except another loop (since we are looking at programs without nesting- even allowing one level of nesting makes this problem undecidable).

So, here are two programs x1 is our input variable and x0 is our output variable:


x0 <- 0
DO x1
x0 <- x0 + 1


x0 <- x1

These programs are equivalent as long as we do not allow our input values to be negative (which seems to be the convention) because both programs will copy the integer value in x1 into x0 (though P1 does it more slowly).  Suppose we used the alternate P1


x0 <- 0
DO x1
x0 <- x0 + 1
x0 <- x0 + 1

..this has the effect of making x0‘s value twice the value of x1.  While P1 and P2 will still have the same value in x0 if x1=0, any other value in x1 will result in different outputs, making our programs inequivalent.

Reduction: Constable, Hunt, and Sahni use 3SAT.  The basic idea is to use loops that execute at most once as if statements.  (I don’t think they have a loop that iterates more than once in the whole program).  So for example, they will set up a clause variable Ci that will be set to 1 if any of its literal variables are true with a subprogram like:

Ci <- 0
DO Ci1  (Ci1 is the first literal of clause i)
Ci <- 1 (Technically this is Ci <- 0 followed by Ci <- Ci +1)

DO Ci2
Ci <- 1

DO Ci3
Ci <- 1

..each of the loops will set Ci to 1 if the literal is true, and leave it alone if the literal is false.

So, with that in mind, we take a 3SAT formula and write a program:

  • Our input variables are the variables to the formula.
  • We create the negations of all of the variables in an if statement  (this way we have both positive and negative literals for each variable)
  • We compute all of the Ci variables as above
  • We compute the negations of all of the Ci variables. (we need this because our loop-based if statement can only do things if a variable is 1, not 0)
  • We create a variable P that is initially 1.  If any of the negated Ci variables are 1 (so the regular Ci variable was 0, so the clause was not satisfied), we reset P to 0.  P is our output variable.

This program will output P to 1 if its inputs form a satisfiable arrangement of variables, and will output 0 otherwise.  This is our P1 program.

Our P2 program is a program that always outputs 0.  P1 and P2 will be different if and only if the formula is satisfiable.

Difficulty: 5. The reduction itself is pretty easy- “write a program that sees if the inputs are satisfiable”.  The hard part is realizing that you don’t really use the loop structure as a loop, and that you aren’t searching for an input to make it satisfiable, you’re just using whatever input the SAT instance gives you.

Inequivalence of Finite Memory Programs

Let’s see if I can build my own reduction for this one.

The problem: Inequivalence of Finite Memory Programs.  This is problem PO13 in the appendix.

The description: Given a finite set X of variables, a finite set of constant symbols Σ, two programs P1 and P2, each made with a sequence of instructions I1..Im (possibly of different lengths in both programs)

  • read xi
  • write vi
  • xi <- vj
  • if vi = vj then goto Ik
  • accept
  • halt (“halt” tends to mean “reject”)

Each xi is a variable from X.  Each vj is either a variable from X, a symbol from Σ, or a special “end marker” symbol $.  Ik is an instruction in the program.

Is there an initial assignment of values to all variables in P1 and P2 (the same assignment to variables in both programs) that leads to different output values of variables in both programs?


Here are two programs:


I1: X1 <- 2
I2: write X1
I3: accept


I1: if X1 = X1  goto I4
I2: X1 <- 3
I3: if X1=X1 goto I5
I4: X1 <- 2
I5: write X1
I6: accept

Despite the two programs being different, P2 will always go to I4 in the first if statement, and then both programs set X1 to 2, write the new value, and halt so that both programs will produce the same output on all inputs.

If we change P2’s first instruction to:

I1: if X1=1 goto I4

..then the programs will produce different outputs whenever X1 is not 1.

Reduction (from the paper): The paper by Jones and Muchnick uses Linear Bounded Automaton Acceptance.  They first show that a program in this format can be represented by a “Deterministic Generalized Sequential Machine” (DGSM), which works like a finite automaton that can output any string on a transition.  (I’m getting the description for this from my (old, 1979) copy of Hopcroft and Ullman, section 11.2.).

The DGSM has a state for each pair of:

  • Labels of read statements in the program
  • All possible strings of length m (the number of variables) containing either a symbol from Σ, or $.

In addition to all of those 2-tuples, we have a state qa (the accepting state) and qh the halting state.

The start state is the tuple (instruction of the first read in the program, whatever is in the variables at that point of the program).

The paper goes on to claim that once we have this, the “transition function can be defined in a totally finite manner”, and then claims that once we do this, we can use another result that says that inequivalence for DGSM’s is NSPACE(log(n))-complete.

I’m a little worried that the number of states in our DGSM is exponential in the size of the number of variables.  Maybe that doesn’t matter because this is just proving the existence of the DGSM?

Reduction (my try): So let’s see if I can try to create a reduction myself.

Suppose we have an LBA M, with its starting input x.  We’ll build a program in our language to simulate its behavior.  We’ll store one variable for each tape cell and an extra variable “xt” holding the (integer) position of the tape head.

We can simulate looking at a tape cell with:

if xt = 1 then goto an instruction that looks at x1
if xt = 2 then goto an instruction that looks at x2
.. and so on

This will increase the size of the  program by a linear amount (because the tape cells only use a linear amount of space)

So now we can simulate our machines using these programs and if our programs are inequivalent, then the LBA’s we started with are as well.

I don’t know.  It feels too easy.

Difficulty: 8.  It might be less if you can provide a clear description of what the instructions in the language does.  The paper describes a configuration of a program as an instruction to perform next, and two strings.  One of the strings represents the contents of the variables.  The other string I think is related to what the read and write instructions do, but it is hard for me to see what exactly those instructions are doing.  So I may have missed something.

Inequivalence of Programs with Arrays, Inequivalence of Programs with Assignments

Both of these problems are from the same paper and use basically the same proof, so we’ll do them together.

The Problems: Inequivalence of Programs With Arrays (PO11 in the appendix) and Inequivalence and Programs with Assignments (PO12 in the appendix)

The Description (for Inequivalence of Programs With Assignments): 

Given two programs P and Q that access a common set of variables, S.  Each statement in the program is of the form “x0 <- if x1 = x2 then x3 else x4“, where each xi is a variable from S.  Will the two programs always give different output for all initial variable configurations?

The Description (for Inequivalence of Programs With Assignments):

Same idea, but the possible statements in the program are:

  • Select an element B of an array α: A <- α.B
  • Update an element B of an array α: α.B <- A
  • “Operate” on an operation Φ: A <- Φ B1B2..Bk.  Φ comes from a given finite set of operations and is expected to take a certain number of arguments (k) that we know ahead of time.

Example: The “outputs” of these programs are the final values of the variables.  They denote a “final” value by representing a program as a Directed Acyclic Graph, and the leaf nodes (of outdegree 0) are the variables with final values.  We will specify what variables we care about as output variables

So for example, we might have the one-line program:

k <- if i=j then 2 else 3

Compared to the one-line program:

k <- if i=j then 3 else 2

If k is our output variable, then for all values of i and j both programs will produce different outputs.  If we change our second program to:

k<- if i=j then 2 else 2

…then if we assign i and j the same values, both programs will give the same output.

Reduction (for Inequivalence of Programs With Assignments): Downey and Sethi use 3SAT.  If we start with a formula with variables y1..yn we will create a program with variables {T,F, A1..An, B1..Bn,).  Then we just write a program that checks if all clauses are true.  “D” will be our output variable and will be updated for each clause.  We start it as true initially:

D <- if T = T then T else T // D is true so far

Then we check the first clause.  A variable “C” will be used for each clause.  It starts out false and will be set to true once the clause is satisfied:

C <- if F = F then F else F // C is false so far

Then for each literal x1..x3 we check if it has been made true.  We use Ai if the literal is positive in the clause and Bi if the literal is negative in the clause.  So suppose our clause was (y1, ~y2, y4), we’d have the lines:

// does the first literal make the clause true?
C <- if A1 = T then T else C
// does the second literal make the clause true?
C <- if B2 = T then T else C
// does the third literal make the clause true?
C <- if A4= T then T else C

C now is true if and only if the clause is satisfied, so let’s update D:

// D becomes false if we don't satisfy that clause
D <- if C = T then D else F

We then repeat that process for each clause.  At the end, we add rules saying that a variable and its negation can’t both have the same value:

D <- if A1 = B1 then F else D

..repeated for each variable.

The final output of this program is D, which will be true if our inputs legally satisfied the formula, and false otherwise.

Our second program just outputs the constant “F”.  The programs will produce different outputs if and only if there is a way for the first program to produce a T, which happens when D is true, which happens when all clauses are true, which happens when the formula is satisfiable.

Reduction (Inequivalence of Programs With Arrays): The same program works, we just need to replace our assignment if operations.  We do this by placing all of the elements of what we are comparing into an array, and replacing the statement “E <- if A=B then C else D” with:

α.A <- D
α.B <- C
E <- α.A

..if A and B are the same, then we will fill that spot in the array with D then C, and get a C out.  Otherwise, the two assignments go to different locations, and E is set to what is in the array at position A, which is the value D.  Since we can go back to the previous reduction and replace all of its statements with 3 statements under the new rules, the reduction works in the same way.

Difficulty: 6.  This isn’t a hard reduction to do, I think the hardest thing is to get students to think in terms 0f the restrictions these programs require.  There are no loops or anything (that would make this problem undecidable, at least in general), and representing variables as positions in an array is something we might be used to doing if all variables are integers, but since variables here are things like “B2” some work might need to be done to convince students that the mapping to integers for all variables is reasonable.  It’s not a thing we do as much with modern high-level languages as we might have done in the 1970s when the underlying machine was much more exposed.

Microcode Bit Optimization

The Problem: Microcode Bit Optimization.  This is problem PO10 in the appendix

The description: We have a set A of “microcommands”.  We take groups of these commands to create a set C of “microinstructions”.   We would like to partition our set A of commands into groups where each command uses at most one instruction from each group. Give an integer K, can we make such a split of A (into groups A1..An) such that \sum_{i=1}^n \lceil log_2(|A_i|+1 \rceil \leq K?

(I’ll note here that the definition in G&J requires this split into groups to be a partition into disjoint sets, while the paper we will be using allows the split to be a covering where a microcommand can appear in multiple Ai)

Example: The log part is there because we can represent groups as binary numbers.  The idea is that if we only have one group, we can represent microinstructions as a binary string of which microcommands we are using.  For example, if we make every microcommand its own group, then a microinstruction can be represented in binary: 101 might mean microcommand #5.  This makes our microinstructions smaller, but each microinstruction can only represent one microcommand.

In the other extreme, we can represent a microinstruction as a binary string of what microcommands are being used. In this model. the instruction “10101” says that we are using the first, third, and fifth microcommand, but not any of the others.  This lets us have longer microinstructions that use several microcommands at once.

So the problem is asking: How many groups are needed to minimize the representation of the microinstructions and mirocommands in a given program?  So, for example, suppose we had 8 microcommands and the following microinstructions (written as a binary string of what microcommands are used):

  • 1000 1111
  • 0100 0011
  • 0010 1100
  • 0001 0101

Notice that our 4 microinstructions never have 2 microcommands used from the 4 leftmost commands at the same time, so we can represent that chunk of the instruction as 2 bits (counting from 0-3, representing what command we are using).  We will need 4 bits to represent the rightmost 4 microcommands because all of those microcommands are in use in the first microinstruction.  But if we change the last 4 bits of the first instruction to 1011, then no instruction uses both the second and third commands out of those 4 bits, and we could represent those 2 commands using 1 bit (maybe a 0 means we use the second microcommand, and a 1 means we use the third one)

Reduction: G&J say to use 3DM, but the paper I found by Robertson uses CNF-SAT.  We assume that no clause has both a literal and its negation (because if it did the clause is automatically true), and also we never have a literal appear both with a variable and its negation in different clauses (we can add extra variables to remove this situation).

Suppose our formula has clauses T1..Tp and variables x1..xk.  Each of these will be its own microcommand.

The microinstructions will be:

  • An instruction {xi, xj} for all i < j.
  • An instruction {xi, Tj} if xi (or its negation) is not in clause Tj
  • An instruction {Ti, Tj} if those 2 clauses do not share any literals (they might share variables)

Define a maximal compatibility class (MCC) as a set of mutually compatible instructions that cannot add any other instruction without breaking compatibility.  Our job is to make these classes (the bigger the classes, the more items we can put in each subset)

We can make an MCC Ri out of xi and all of the Tj‘s that have xi in the clause.  We can make a second MCC Si out of xi and all of the Tj‘s that have ~xi in the clause.  Our goal is to use just k of these classes (one for each variable) to cover all of the instructions.

So, suppose the formula is satisfiable.  Then for all variables that are set to true in the satisfying arrangement, we use the Ri class of the variable, and for all variables that are set to false, we use the Si class for that variable.  Since we have chosen either Ri or Sfor each variable, each xi is covered.  Since the formula is satisfiable, all Ti are also covered by whatever variable makes the clause true.

In the other direction, if we have exactly k R’s and S’s and cover all of the T instructions, then the choice of R or S for each variable will give a consistent way to satisfy each clause.

We are not quite done, because of the requirement to use the log formula to measure the size of the sets, instead of just minimizing the number of the sets.  The paper goes into the algebraic details of manipulating these sets to satisfy those requirements.

Difficulty: 8. The log formula stuff is almost longer than the rest of the proof, which is annoying.  Other than that, the reduction is relatively straightforward, I think.

Ensemble Computation

This next one is done as an example of “Local Replacement” in the main part of the G&J book, but it’s pretty complicated, so I figured it was worth a post.

The problem: Ensemble Computation.  This is problem PO9 in the appendix and is done on pages 66-68 of G&J.

The description: Given a set A, and a collection C os subsets of A, and a positive integer J.  Can we find a sequence z1..zj, of at most J elements, where:

  • Each zi is formed as the union of 2 disjoint sets xi and yi
  • Each xi and yi is either (the set containing) a single element of A, or a zk for some k < i
  • Every set in C appears as a zi someplace?

Example: G&J say that this problem comes out of a desire to minimize the number of multiplications needed to compute a set of products- the idea being that the union of 2 sets works like the multiplication of the sets with those terms.  So let’s suppose A = {a,b,c,d,e}, and C = {ab, bcd, ce}

(Really C should be a collection of sets like {a,b}, but I think that thinking of it as multiplication is helpful)

Then a sequence that might get everything in C might be:

  • z1 = ab
  • z2 = bc
  • z3 = bcd
  • z4 = ce

..which is the best we can do.  But now imagine that the second element of C was “abc” instead of “bcd”.  Then we can make the sequence:

  • z1 = ab
  • z2 = abc
  • z3 = ce

.. Using the z1 result as an operand to create our z2, enabling us to do it in fewer steps.

Reduction: G&J use Vertex Cover.  So we’re given a graph G=(V,E) and an integer K.  Our set A will be the set of all vertices in V, plus a new element a0.  For each edge (u,v) in E, we will create a set {a0, u, v} in C.  Our bound J will be set to K + |E|.

Suppose G has a vertex cover of size K (we’ll call them v1 .. vk).  Then we know that each edge in E can be built using at least one of the VC vertices.  So the following sequence will work and be of size J:

  • z1 through zk are formed by unioning a0 with the VC vertex vi (i=1..k)
  • Each of the remaining unions will make an element of C (based on an edge in E) by unioning the z set that has the VC vertex in the edge with the vertex in A that has the other end of the vertex.

In the other direction, suppose we have a sequence of unions z1..zj that solves the Ensemble Computation problem.  There may be multiple possible solutions to this problem, let’s look at the shortest sequence, and also the one that has the least number of operations that make a z set out of 2 elements of V (and not a0)  It turns out that since we have a minimum length sequence, the only reason to union 2 elements u and v from V together would be because (u,v) is an edge in E, and we want to union in a0 later to make the set {a0, u, v} in C.  But then we can rearrange the order and union a0 with u first, and union that result with v to get the same element of C in the same number of steps.  So this means that our sequence has 2 kinds of operations:

  1. Unions of some vertex with a0
  2. Unions of one of the above sets with a vertex to make an element in C.

We know that there are |E| of the second kind of union (because there are |E| elements in C), and so there must be K of the first kind of union.  So the vertices we union directly with a0 must be of size K, and since one of them gets combined in each set in C (and thus is a part of each edge in E), must be a vertex cover.

Difficulty: 6. This is one of those “the description is harder than the actual problem” problems.  All of the talk of sequences and unions and things looks scary, but the actual problem isn’t that bad.  As for the actual reduction, I think it would be a good hard homework problem.  One trap I see is that the need for a0 might not be obvious (it’s there to force us to “choose” the VC vertices and commit them to be a step in our sequence).  So maybe that needs to be a hint.