# Monthly Archives: June 2022

## Inequivalence of Programs with Arrays, Inequivalence of Programs with Assignments

Both of these problems are from the same paper and use basically the same proof, so we’ll do them together.

The Problems: Inequivalence of Programs With Arrays (PO11 in the appendix) and Inequivalence of Programs With Assignments (PO12 in the appendix)

The Description (for Inequivalence of Programs With Assignments):

We are given two programs P and Q that access a common set of variables, S.  Each statement in the programs is of the form “x0 <- if x1 = x2 then x3 else x4”, where each xi is a variable from S.  Is there an initial variable configuration for which the two programs give different outputs?

The Description (for Inequivalence of Programs With Arrays):

Same idea, but the possible statements in the program are:

• Select an element B of an array α: A <- α.B
• Update an element B of an array α: α.B <- A
• “Operate” on an operation Φ: A <- Φ B1B2..Bk.  Φ comes from a given finite set of operations and is expected to take a certain number of arguments (k) that we know ahead of time.

Example: The “outputs” of these programs are the final values of the variables.  The paper denotes a “final” value by representing a program as a Directed Acyclic Graph, where the leaf nodes (of outdegree 0) are the variables with final values.  We will specify which variables we care about as output variables.

So for example, we might have the one-line program:

k <- if i=j then 2 else 3

Compared to the one-line program:

k <- if i=j then 3 else 2

If k is our output variable, then for all values of i and j both programs will produce different outputs.  If we change our second program to:

k <- if i=j then 2 else 2

…then if we assign i and j the same values, both programs will give the same output (they still differ whenever i and j have different values).
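To make the example concrete, here is a minimal sketch of an interpreter for these one-statement programs, plus a brute-force search for inputs on which two programs disagree.  The names `run` and `differing_inputs` are mine, not the paper's, and I'm assuming the constants 2 and 3 are modeled as variables (`two`, `three`) with fixed initial values, since the formal statement only allows variables.

```python
from itertools import product

def run(program, env):
    """Run statements (x0, x1, x2, x3, x4) meaning: x0 <- if x1 = x2 then x3 else x4."""
    env = dict(env)  # work on a copy so the caller's values are untouched
    for target, left, right, then_var, else_var in program:
        env[target] = env[then_var] if env[left] == env[right] else env[else_var]
    return env

def differing_inputs(p, q, free_vars, fixed, output, domain):
    """Return every assignment to free_vars on which p and q disagree on `output`."""
    witnesses = []
    for values in product(domain, repeat=len(free_vars)):
        assignment = dict(zip(free_vars, values))
        env = {**fixed, **assignment}
        if run(p, env)[output] != run(q, env)[output]:
            witnesses.append(assignment)
    return witnesses

FIXED = {"two": 2, "three": 3}                 # assumed stand-ins for the constants
P = [("k", "i", "j", "two", "three")]          # k <- if i=j then 2 else 3
Q_SWAPPED = [("k", "i", "j", "three", "two")]  # k <- if i=j then 3 else 2
Q_CONST = [("k", "i", "j", "two", "two")]      # k <- if i=j then 2 else 2

always_differ = differing_inputs(P, Q_SWAPPED, ["i", "j"], FIXED, "k", [0, 1])
sometimes_differ = differing_inputs(P, Q_CONST, ["i", "j"], FIXED, "k", [0, 1])
```

With the two-value domain used here, `always_differ` contains all four assignments, while `sometimes_differ` contains only the two where i and j are different.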

Reduction (for Inequivalence of Programs With Assignments): Downey and Sethi use 3SAT.  If we start with a formula with variables y1..yn we will create a program with variables {T, F, A1..An, B1..Bn}.  Then we just write a program that checks if all clauses are true.  “D” will be our output variable and will be updated for each clause.  We start it as true initially:

D <- if T = T then T else T // D is true so far

Then we check the first clause.  A variable “C” will be used for each clause.  It starts out false and will be set to true once the clause is satisfied:

C <- if F = F then F else F // C is false so far

Then for each literal x1..x3 we check if it has been made true.  We use Ai if the literal is positive in the clause and Bi if the literal is negative in the clause.  So suppose our clause was (y1, ~y2, y4), we’d have the lines:

// does the first literal make the clause true?
C <- if A1 = T then T else C
// does the second literal make the clause true?
C <- if B2 = T then T else C
// does the third literal make the clause true?
C <- if A4 = T then T else C

C now is true if and only if the clause is satisfied, so let’s update D:

// D becomes false if we don't satisfy that clause
D <- if C = T then D else F

We then repeat that process for each clause.  At the end, we add rules saying that a variable and its negation can’t both have the same value:

D <- if A1 = B1 then F else D

..repeated for each variable.

The final output of this program is D, which will be true if our inputs legally satisfied the formula, and false otherwise.

Our second program just outputs the constant “F”.  The programs will produce different outputs if and only if there is a way for the first program to produce a T, which happens when D is true, which happens when all clauses are true, which happens when the formula is satisfiable.
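Here is a rough sketch of that construction, followed by a brute-force check of whether the two programs can ever disagree.  This is my own illustration and not code from the paper: clauses are tuples of signed integers (so (1, -2, 4) is the clause from above), the function names are made up, and I'm assuming T and F start out as true and false and that every Ai and Bi starts as one of those two values.

```python
from itertools import product

def run(program, env):
    """Run statements (x0, x1, x2, x3, x4) meaning: x0 <- if x1 = x2 then x3 else x4."""
    env = dict(env)
    for target, left, right, then_var, else_var in program:
        env[target] = env[then_var] if env[left] == env[right] else env[else_var]
    return env

def build_program(clauses, n):
    """Build the clause-checking program for a formula over y1..yn."""
    program = [("D", "T", "T", "T", "T")]            # D is true so far
    for clause in clauses:
        program.append(("C", "F", "F", "F", "F"))    # C is false so far
        for literal in clause:
            name = f"A{literal}" if literal > 0 else f"B{-literal}"
            program.append(("C", name, "T", "T", "C"))
        program.append(("D", "C", "T", "D", "F"))    # unsatisfied clause kills D
    for i in range(1, n + 1):
        program.append(("D", f"A{i}", f"B{i}", "F", "D"))  # Ai and Bi must differ
    return program

CONSTANT_F = [("D", "F", "F", "F", "F")]             # the second program

def inequivalent(clauses, n):
    """True if some initial configuration makes the two programs output different D."""
    program = build_program(clauses, n)
    names = [f"A{i}" for i in range(1, n + 1)] + [f"B{i}" for i in range(1, n + 1)]
    for values in product([True, False], repeat=2 * n):
        env = {"T": True, "F": False, **dict(zip(names, values))}
        if run(program, env)["D"] != run(CONSTANT_F, env)["D"]:
            return True
    return False

satisfiable_case = inequivalent([(1, -2, 4)], 4)
unsatisfiable_case = inequivalent([(1, 1, 1), (-1, -1, -1)], 1)
```

The brute-force loop is exponential, of course.  It is only there to confirm on small formulas that the programs differ exactly when the formula is satisfiable.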

Reduction (Inequivalence of Programs With Arrays): The same program works, we just need to replace our if-assignment statements.  We do this by placing all of the elements of what we are comparing into an array, and replacing the statement “E <- if A=B then C else D” with:

α.A <- D
α.B <- C
E <- α.A

..if A and B are the same, then we will fill that spot in the array with D then C, and get a C out.  Otherwise, the two assignments go to different locations, and E is set to what is in the array at position A, which is the value D.  Since we can go back to the previous reduction and replace all of its statements with 3 statements under the new rules, the reduction works in the same way.
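A quick sanity check of that replacement, as a sketch: I'm modeling the array α as a fresh Python dictionary indexed by the values of A and B, which is my assumption about how to simulate it and not something from the paper.

```python
from itertools import product

def if_statement(a, b, c, d):
    """E <- if A = B then C else D"""
    return c if a == b else d

def array_version(a, b, c, d):
    """The three array statements that replace the if statement."""
    alpha = {}       # assumed: a fresh array, modeled as a dictionary
    alpha[a] = d     # α.A <- D
    alpha[b] = c     # α.B <- C (overwrites the D when A and B are the same index)
    return alpha[a]  # E <- α.A

# Compare the two versions on every combination of small values.
agree = all(
    if_statement(*values) == array_version(*values)
    for values in product(range(3), repeat=4)
)
```

The order of the two updates matters: writing D first and C second is what lets C win when the indexes collide.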

Difficulty: 6.  This isn’t a hard reduction to do.  I think the hardest thing is to get students to think in terms of the restrictions these programs require.  There are no loops or anything (that would make this problem undecidable, at least in general), and representing variables as positions in an array is something we might be used to doing if all variables are integers, but since variables here are things like “B2” some work might need to be done to convince students that the mapping to integers for all variables is reasonable.  It’s not a thing we do as much with modern high-level languages as we might have done in the 1970s when the underlying machine was much more exposed.

## Microcode Bit Optimization

The Problem: Microcode Bit Optimization.  This is problem PO10 in the appendix

The description: We have a set A of “microcommands”.  We take groups of these commands to create a set C of “microinstructions”.  We would like to partition our set A of commands into groups where each instruction uses at most one command from each group.  Given an integer K, can we make such a split of A (into groups A1..An) such that the sum of ⌈log2(|Ai| + 1)⌉ over all n groups is at most K?

(I’ll note here that the definition in G&J requires this split into groups to be a partition into disjoint sets, while the paper we will be using allows the split to be a covering where a microcommand can appear in multiple Ai)

Example: The log part is there because we can represent groups as binary numbers.  The idea is that if we put all of the microcommands into one group, then a microinstruction can be represented as the binary number of the single microcommand it uses: 101 might mean microcommand #5.  This makes our microinstructions smaller, but each microinstruction can only represent one microcommand.

In the other extreme, every microcommand is its own group, and we represent a microinstruction as a binary string of what microcommands are being used.  In this model, the instruction “10101” says that we are using the first, third, and fifth microcommand, but not any of the others.  This lets us have longer microinstructions that use several microcommands at once.

So the problem is asking: How many groups are needed to minimize the representation of the microinstructions and microcommands in a given program?  So, for example, suppose we had 8 microcommands and the following microinstructions (written as a binary string of what microcommands are used):

• 1000 1111
• 0100 0011
• 0010 1100
• 0001 0101

Notice that our 4 microinstructions never have 2 microcommands used from the 4 leftmost commands at the same time, so we can represent that chunk of the instruction as 2 bits (counting from 0-3, representing what command we are using).  We will need 4 bits to represent the rightmost 4 microcommands because all of those microcommands are in use in the first microinstruction.  But if we change the last 4 bits of the first instruction to 1011, then no instruction uses both the second and third commands out of those 4 bits, and we could represent those 2 commands using 1 bit (maybe a 0 means we use the second microcommand, and a 1 means we use the third one).
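The grouping claims in this example are easy to check mechanically.  Here is a small sketch (the helper names `parse` and `compatible` are mine) that numbers the commands 0 to 7 from the left and tests whether a proposed group has at most one command in use in every instruction.

```python
from itertools import combinations

def parse(bits):
    """Return the set of command indexes (0 = leftmost) that a bit string turns on."""
    bits = bits.replace(" ", "")
    return {i for i, bit in enumerate(bits) if bit == "1"}

def compatible(group, instructions):
    """A group is usable if every instruction uses at most one command from it."""
    return all(len(group & instruction) <= 1 for instruction in instructions)

ORIGINAL = [parse(s) for s in ["1000 1111", "0100 0011", "0010 1100", "0001 0101"]]
MODIFIED = [parse("1000 1011")] + ORIGINAL[1:]

# The four leftmost commands can share one group.
left_ok = compatible({0, 1, 2, 3}, ORIGINAL)

# Which pairs of the four rightmost commands could share a group?
right_pairs_original = [
    pair for pair in combinations(range(4, 8), 2) if compatible(set(pair), ORIGINAL)
]
right_pairs_modified = [
    pair for pair in combinations(range(4, 8), 2) if compatible(set(pair), MODIFIED)
]
```

In the original program no pair of the rightmost commands can be grouped, and after the change the only pair that can is (5, 6), which is the second and third of those four commands.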

Reduction: G&J say to use 3DM, but the paper I found by Robertson uses CNF-SAT.  We assume that no clause has both a literal and its negation (because if it did the clause is automatically true), and also we never have a literal appear both with a variable and its negation in different clauses (we can add extra variables to remove this situation).

Suppose our formula has clauses T1..Tp and variables x1..xk.  Each of these will be its own microcommand.

The microinstructions will be:

• An instruction {xi, xj} for all i < j.
• An instruction {xi, Tj} if xi (or its negation) is not in clause Tj
• An instruction {Ti, Tj} if those 2 clauses do not share any literals (they might share variables)

Call two microcommands compatible if no microinstruction uses both of them, and define a maximal compatibility class (MCC) as a set of mutually compatible microcommands to which no other microcommand can be added without breaking compatibility.  Our job is to make these classes (the bigger the classes, the more items we can put in each subset).

We can make an MCC Ri out of xi and all of the Tj‘s that have xi in the clause.  We can make a second MCC Si out of xi and all of the Tj‘s that have ~xi in the clause.  Our goal is to use just k of these classes (one for each variable) to cover all of the microcommands.

So, suppose the formula is satisfiable.  Then for all variables that are set to true in the satisfying arrangement, we use the Ri class of the variable, and for all variables that are set to false, we use the Si class for that variable.  Since we have chosen either Ri or Si for each variable, each xi is covered.  Since the formula is satisfiable, all Ti are also covered by the class of whatever variable makes the clause true.

In the other direction, if we have exactly k R’s and S’s and cover all of the T commands, then the choice of R or S for each variable will give a consistent way to satisfy each clause.
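To see the pieces fit together, here is a sketch that builds the three kinds of microinstructions for a small formula and checks that the classes picked by a satisfying assignment are compatible and cover everything.  The formula, the function names, and the encoding (clauses as sets of signed integers, commands as tagged tuples, clauses numbered from 0) are all my own choices for illustration.  It does not touch the log-formula part of the proof.

```python
from itertools import combinations

def build_instructions(clauses, k):
    """Build the microinstructions for variables x1..xk and the given clauses."""
    x_cmds = [("x", i) for i in range(1, k + 1)]
    instructions = [frozenset(pair) for pair in combinations(x_cmds, 2)]
    for i in range(1, k + 1):
        for j, clause in enumerate(clauses):
            if i not in clause and -i not in clause:      # xi absent from Tj
                instructions.append(frozenset({("x", i), ("T", j)}))
    for a, b in combinations(range(len(clauses)), 2):
        if not clauses[a] & clauses[b]:                   # no shared literal
            instructions.append(frozenset({("T", a), ("T", b)}))
    return instructions

def variable_class(clauses, i, positive):
    """Ri (positive=True) or Si (positive=False) for variable xi."""
    literal = i if positive else -i
    return {("x", i)} | {("T", j) for j, c in enumerate(clauses) if literal in c}

def compatible(group, instructions):
    """True if every instruction uses at most one command from the group."""
    return all(len(group & instruction) <= 1 for instruction in instructions)

# (x1 or x2) and (~x1 or x3) and (~x2 or ~x3), satisfied by x1 = x3 = true, x2 = false
CLAUSES = [{1, 2}, {-1, 3}, {-2, -3}]
ASSIGNMENT = {1: True, 2: False, 3: True}

INSTRUCTIONS = build_instructions(CLAUSES, 3)
CHOSEN = [variable_class(CLAUSES, i, value) for i, value in ASSIGNMENT.items()]
COVERED = set().union(*CHOSEN)
ALL_COMMANDS = {("x", i) for i in (1, 2, 3)} | {("T", j) for j in range(3)}
```

Here the three chosen classes are R1, S2, and R3, and between them they contain every x and every T.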

We are not quite done, because of the requirement to use the log formula to measure the size of the sets, instead of just minimizing the number of the sets.  The paper goes into the algebraic details of manipulating these sets to satisfy those requirements.

Difficulty: 8. The log formula stuff is almost longer than the rest of the proof, which is annoying.  Other than that, the reduction is relatively straightforward, I think.

## Ensemble Computation

This next one is done as an example of “Local Replacement” in the main part of the G&J book, but it’s pretty complicated, so I figured it was worth a post.

The problem: Ensemble Computation.  This is problem PO9 in the appendix and is done on pages 66-68 of G&J.

The description: Given a set A, a collection C of subsets of A, and a positive integer J, can we find a sequence z1..zj of at most J sets, where:

• Each zi is formed as the union of 2 disjoint sets xi and yi
• Each xi and yi is either (the set containing) a single element of A, or a zk for some k < i
• Every set in C appears as a zi someplace?

Example: G&J say that this problem comes out of a desire to minimize the number of multiplications needed to compute a set of products, the idea being that the union of 2 sets works like the multiplication of the sets with those terms.  So let’s suppose A = {a,b,c,d,e}, and C = {ab, bcd, ce}

(Really C should be a collection of sets like {a,b}, but I think that thinking of it as multiplication is helpful)

Then a sequence that might get everything in C might be:

• z1 = ab
• z2 = bc
• z3 = bcd
• z4 = ce

..which is the best we can do.  But now imagine that the second element of C was “abc” instead of “bcd”.  Then we can make the sequence:

• z1 = ab
• z2 = abc
• z3 = ce

.. Using the z1 result as an operand to create our z2, enabling us to do it in fewer steps.
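The rules for a legal sequence can be written as a small checker.  This is a sketch with a made-up encoding: each step is a pair of operands, where a string is a single element of A and an integer k refers back to zk (counting from 1, like the text does).

```python
def check_sequence(elements, targets, steps, bound):
    """Return True if `steps` is a legal sequence of at most `bound` unions hitting every target."""
    if len(steps) > bound:
        return False
    z = []
    for step in steps:
        operands = []
        for operand in step:
            if isinstance(operand, int):          # reference to an earlier z
                if not 1 <= operand <= len(z):
                    return False
                operands.append(z[operand - 1])
            elif operand in elements:             # a single element of A
                operands.append(frozenset({operand}))
            else:
                return False
        if operands[0] & operands[1]:             # the two sets must be disjoint
            return False
        z.append(operands[0] | operands[1])
    return all(frozenset(target) in z for target in targets)

A = set("abcde")

# C = {ab, bcd, ce}: z1 = ab, z2 = bc, z3 = z2 + d, z4 = ce
first = check_sequence(A, ["ab", "bcd", "ce"],
                       [("a", "b"), ("b", "c"), (2, "d"), ("c", "e")], 4)

# C = {ab, abc, ce}: z1 = ab, z2 = z1 + c, z3 = ce
second = check_sequence(A, ["ab", "abc", "ce"],
                        [("a", "b"), (1, "c"), ("c", "e")], 3)
```

Both calls return True.  The checker only validates a sequence you hand it, so it says nothing by itself about whether a shorter sequence exists.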

Reduction: G&J use Vertex Cover.  So we’re given a graph G=(V,E) and an integer K.  Our set A will be the set of all vertices in V, plus a new element a0.  For each edge (u,v) in E, we will create a set {a0, u, v} in C.  Our bound J will be set to K + |E|.

Suppose G has a vertex cover of size K (we’ll call them v1 .. vk).  Then we know that each edge in E can be built using at least one of the VC vertices.  So the following sequence will work and be of size J:

• z1 through zk are formed by unioning a0 with the VC vertex vi (i=1..k)
• Each of the remaining unions will make an element of C (based on an edge in E) by unioning the z set that has the VC vertex in the edge with the vertex in A that is the other end of the edge.
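The forward direction above can be sketched in code.  The function names and the small path graph are my own, vertices are strings so they can't be confused with the integer back-references to earlier z sets, and the steps use the same (operand, operand) encoding idea: a string is an element of A, an integer k means zk.

```python
def reduce_vertex_cover(edges, k):
    """Build the Ensemble Computation instance (A, C, J) from a graph and bound K."""
    vertices = {v for edge in edges for v in edge}
    a = vertices | {"a0"}
    c = [frozenset({"a0", u, v}) for u, v in edges]
    return a, c, k + len(edges)

def sequence_from_cover(edges, cover):
    """Turn a vertex cover into the sequence of unions described above."""
    cover = list(cover)
    steps = [("a0", v) for v in cover]            # z1..zK: a0 with each cover vertex
    for u, v in edges:
        if u in cover:
            steps.append((cover.index(u) + 1, v))  # z for u, unioned with v
        elif v in cover:
            steps.append((cover.index(v) + 1, u))  # z for v, unioned with u
        else:
            raise ValueError("not a vertex cover")
    return steps

def expand(steps):
    """Compute the actual sets z1..zj from the steps."""
    z = []
    for x, y in steps:
        left = z[x - 1] if isinstance(x, int) else frozenset({x})
        z.append(left | frozenset({y}))
    return z

EDGES = [("a", "b"), ("b", "c"), ("c", "d")]      # a path with cover {b, c}
A, C, J = reduce_vertex_cover(EDGES, 2)
STEPS = sequence_from_cover(EDGES, ["b", "c"])
Z = expand(STEPS)
```

For this graph J is 5, the sequence has exactly 5 steps, and every set in C shows up among the z sets.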

In the other direction, suppose we have a sequence of unions z1..zj that solves the Ensemble Computation problem.  There may be multiple possible solutions to this problem, so let’s look at the shortest sequence, and among those the one that has the least number of operations that make a z set out of 2 elements of V (and not a0).  It turns out that since we have a minimum length sequence, the only reason to union 2 elements u and v from V together would be because (u,v) is an edge in E, and we want to union in a0 later to make the set {a0, u, v} in C.  But then we can rearrange the order and union a0 with u first, and union that result with v to get the same element of C in the same number of steps.  So this means that our sequence has 2 kinds of operations:

1. Unions of some vertex with a0
2. Unions of one of the above sets with a vertex to make an element in C.

We know that there are |E| of the second kind of union (because there are |E| elements in C), and so there can be at most K of the first kind of union (because J = K + |E|).  So the set of vertices we union directly with a0 has size at most K, and since one of them gets combined into each set in C (and thus is an endpoint of each edge in E), it must be a vertex cover.

Difficulty: 6. This is one of those “the description is harder than the actual problem” problems.  All of the talk of sequences and unions and things looks scary, but the actual problem isn’t that bad.  As for the actual reduction, I think it would be a good hard homework problem.  One trap I see is that the need for a0 might not be obvious (it’s there to force us to “choose” the VC vertices and commit them to be a step in our sequence).  So maybe that needs to be a hint.