After this week, I’m going to be traveling for three weeks. So the next post will be the week of August 14.

**The problem: **K-Vertex Cover. This problem is not in the appendix.

**The description: **Given a graph G=(V,E), and integers K and J. Does G contain a set of J or less paths, where each path contains K or less edges, and each edge in E is incident on at least one of these paths?

**Example: **Here is a graph:

A K-Vertex Cover where K = 0 is just a regular Vertex Cover:

With K=1, this is “Find a subset E’ of E where each edge in E shares an endpoint with something in E'”:

With K=2, we are finding paths of 2 (or less) edges where each edge in E is adjacent to something in a path (I colored each path a different color to make them easier to tell apart):
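To make the definition concrete, here is a small checker for a proposed solution. This is my own sketch (the helper name and the example graph are mine, not from the paper): a path is a list of vertices, and an edge is "incident on" a path if it shares an endpoint with it. Note that a path with one vertex has zero edges, which is why K=0 collapses to ordinary Vertex Cover.

```python
# A sketch of a verifier for K-Vertex Cover solutions (my own helper, not
# from the paper). A path is a list of vertices; a one-vertex path has
# zero edges, so K=0 reduces to ordinary Vertex Cover.
def is_k_vertex_cover(edges, paths, K, J):
    if len(paths) > J:
        return False
    edge_set = {frozenset(e) for e in edges}
    for p in paths:
        if len(p) - 1 > K:           # each path may have at most K edges
            return False
        for u, v in zip(p, p[1:]):   # consecutive path vertices must share an edge
            if frozenset((u, v)) not in edge_set:
                return False
    covered = {v for p in paths for v in p}
    # every edge must touch a vertex on some path
    return all(u in covered or v in covered for u, v in edges)

# Hypothetical example: a triangle a-b-c with a pendant edge c-d.
E = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(is_k_vertex_cover(E, [["b"], ["c"]], K=0, J=2))  # True: {b,c} is a VC
print(is_k_vertex_cover(E, [["a"]], K=0, J=1))         # False: (b,c) uncovered
```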

Notice that K is set as a parameter of the problem.

**Reduction: **The report by Storer that had the previous reduction (and which will have the next one) introduces this problem. The reduction is from Vertex Cover. I guess we could just take the initial VC instance, set K=0, and be done, but I like this construction because it shows how to make 1-VC, 2-VC, and so on NP-Complete. (In other words, if K is fixed outside of the problem, you need this reduction). Interestingly, a |V|-vertex cover is trivial (it’s asking if the graph has J or less connected components).

Given our VC instance G=(V,E), with bound J (instead of K, just because K here means something different and the J for our problem will be the same J for the VC instance), build a new graph G’=(V’, E’):

V’ starts with V, and adds two new vertices for each K and each J (so 2*K*J new vertices) labeled x_{1,1} through x_{j,k} and y_{1,1} through y_{j,k}. Notice that since J and K should be ≤ |V| (or the answer is trivial), this only adds a polynomial number of vertices.

E’ starts with E, adds an edge from each vertex in V to each of the x vertices labeled x_{i,1}, an edge from each x_{i,j} to its corresponding y_{i,j}, and an edge from each x_{i,j} to the next x_{i,j+1}.
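The construction above can be sketched in code. This is my own rendering of the description (the tuple vertex names are mine), building V’ and E’ from a VC instance and the bounds J and K:

```python
# Sketch of the K-Vertex Cover construction described above (names are mine).
# x/y vertices are tuples ('x', i, j) and ('y', i, j), i in 1..J, j in 1..K.
def build_kvc_instance(V, E, J, K):
    Vp = set(V)
    Ep = {frozenset(e) for e in E}
    for i in range(1, J + 1):
        for j in range(1, K + 1):
            Vp.add(('x', i, j))
            Vp.add(('y', i, j))
            Ep.add(frozenset({('x', i, j), ('y', i, j)}))   # x_{i,j} - y_{i,j}
            if j > 1:                                       # chain the x's
                Ep.add(frozenset({('x', i, j - 1), ('x', i, j)}))
        if K >= 1:
            for v in V:                                     # V to each x_{i,1}
                Ep.add(frozenset({v, ('x', i, 1)}))
    return Vp, Ep

V = ['a', 'b', 'c']
E = [('a', 'b'), ('b', 'c')]
Vp, Ep = build_kvc_instance(V, E, J=2, K=2)
# |V'| = |V| + 2*J*K and |E'| = |E| + |V|*J + J*K + J*(K-1)
print(len(Vp), len(Ep))  # 11 14
```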

The idea is that since each x vertex needs to connect to a y vertex, we will need to include a path that goes through each x vertex in our k-cover. Since there are J paths of length K, and each vertex in V is connected to all of the x_{i,1} vertices, what happens is that each of the J vertices chosen in the VC of G “chooses” a different path of x vertices. So we can cover all of the edges in G’ with J paths if and only if we can cover all of the edges in G with J vertices.

**Difficulty: **5. This is a pretty cool reduction. I like the idea of adding K*J copies of the vertices, because while you can do that for a graph problem (since K and J need to be < |V| realistically), you can’t do that for other kinds of problems. For example, trying to create an element for each value from 1 to K in the Sum of Subsets problem won’t give you a polynomial reduction.

**The problem: **External Macro Data Compression. This is problem SR22 in the appendix.

**The description: **Given a string s over some alphabet Σ, a “pointer cost” h and a bound B, can we find two strings D and C over the alphabet of Σ augmented with some number (< |s|) of “pointer characters” p_{i}, such that:

- |D| + |C| + (h-1)* (# of p characters in D and C) ≤ B
- We can generate s by replacing pointer characters in C with their “meaning” in D.

**Example: ** The concept they’re trying to get at isn’t that hard; it’s just hard to explain using mathematical language. Suppose s = “ABCDEFABCGABCDEF”

Then we can define D to be “ABCDEF” and C to be “pqpGpq”. The total size of this is 6 for D, plus 6 for C. There are 5 pointers, so by the formula above our total cost is 6+6+5(h-1).

The idea is that the characters p and q are “pointers” that refer to substrings in D (p refers to “ABC” and q refers to “DEF”). By replacing those pointers with what they “mean” in C, we can get back s.

A tricky part of this is that you are allowed to have substrings overlap. So if s was “ABCDBCD” we could define D to be “ABCD” and C to be “pq” with p meaning “ABCD” and q meaning “BCD”. Now our total cost is 4 for D, 2 for C, and 2 pointers, so 4+2+2(h-1).
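Here is a quick sketch of the cost accounting (my own code; I’m representing each pointer as a (start, length) slice into D, which is one way to make a pointer’s “meaning” concrete):

```python
# Decode an external-macro compression and compute its cost (my own sketch).
# C is a list whose entries are either plain characters or (start, length)
# pointers into D. Each pointer counts h toward the cost, which is what the
# formula |D| + |C| + (h-1) * (number of pointers) captures.
def decode(D, C):
    out = []
    for item in C:
        if isinstance(item, tuple):
            start, length = item
            out.append(D[start:start + length])
        else:
            out.append(item)
    return ''.join(out)

def cost(D, C, h):
    pointers = sum(1 for item in C if isinstance(item, tuple))
    return len(D) + len(C) + (h - 1) * pointers

D = "ABCDEF"
p, q = (0, 3), (3, 3)          # p means "ABC", q means "DEF"
C = [p, q, p, 'G', p, q]
print(decode(D, C))            # ABCDEFABCGABCDEF
print(cost(D, C, h=2))         # 6 + 6 + 5*(2-1) = 17
```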

**Reduction**: The reduction (and the one for SR23) comes from a technical report by Storer, which is pretty dense. I think we’re looking at “Theorem 2” in the paper, which reduces from Vertex Cover on graphs where each vertex has degree at most 3 (VC≤3).

The alphabet that will be built from the VC instance has a lot of parts:

- A special symbol $
- 3 symbols v_{i}, a_{i}, and b_{i} for each vertex v_{i} in the graph
- 4 more symbols f_{i,1,1} through f_{i,2,2} for each vertex v_{i} in the graph
- one symbol d_{i} for each edge e_{i} in the graph
- 2 symbols c_{1} and c_{2} (there is actually one c symbol for each value of h. We’ll assume h=2 here)
- 3 symbols g_{1,1}, g_{1,2}, and g_{2,1} (This is also based on h. So really you’d go from g_{1,1} through g_{h-1,2} and add g_{h,1})

The string s will also be built from parts (a lot of these are based on h as well; again, I’m fixing h=2 to keep it simpler)

- V_{i,1} = a_{i}$v_{i}
- V_{i,2} = v_{i}$b_{i}
- For each edge e_{i} = (v_{j}, v_{k}), E_{i} = $v_{j}$v_{k}$
- We also have Z_{1} = (c_{1})^{3} (so 3 copies of c_{1}) and Z_{2} = (c_{2})^{3}

s is:

- E_{i}d_{i} concatenated for each edge, followed by
- V_{i,j}f_{i,j,k} concatenated over each vertex and each possible f symbol, followed by
- Z_{1}g_{1,1}Z_{1}g_{1,2}, followed by
- Z_{2}g_{2,1}Z_{2}

K’ = |s| + K – (7/2)|V|.

The basic idea from here is that if G has a VC, we can compress s by making pointers for (among other things) the combination of V_{i,1}f_{i,1,2}V_{i,2} and the combination of V_{i,2}f_{i,2,2}V_{i+1,1} where vertex v_{i} is in the VC. This lets us use the pointers both for the part of the string with the V’s and f’s, but also for the $v_{i}$ strings in the E components, saving space.

In the other direction, if we have a compressed string of size K’, he shows that you must have compressed a string like the above, and the overlaps show you the vertices in the cover.

**Difficulty: **8. I think to really get what’s happening, you need to actually build an example on a graph. But I think the idea of building the sets so that the overlap is what you’re looking for is something that can be gotten eventually.

**The problem: **Grouping By Swapping. This is problem SR21 in the appendix.

**The description: **Given a string x over some finite alphabet, and an integer K, can we convert x into a string that groups all characters together in adjacent positions by using K or less swaps of adjacent symbols?

**Example: **Suppose x was abcabccba

Our goal is to create a string where all of the a’s are consecutive, all of the b’s are consecutive, and all of the c’s are consecutive. So for example, we could try to gather the b’s together in the middle.

- Start with abcabccba
- 2 swaps get us acabbccba
- 2 swaps get us acabbbcca
- Now we need to bring the c’s together. 4 swaps get us aabbbccca
- Now we need to bring the a’s together. Swapping from the right we can get aaabbbccc in 6 more swaps. This is a total of 14 swaps

A better method would be to try to bring the c’s together first:

- Start with abcabccba
- 2 swaps get us ababcccba
- 3 swaps get us to ababbccca
- 1 swap gets us to aabbbccca
- 6 swaps get us to aaabbbccc. This is a total of 12 swaps.

Note that there is no rule that the a’s have to come before the b’s or anything. If we could get a final string of, say, bbbaaaccc in less total swaps, that would be fine.
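Since the alphabet is tiny here, we can brute-force this idea (my own sketch, not from the paper): for each ordering of the letters, build the grouped target, map each character of x to its target position keeping equal characters in their original relative order (optimal by an exchange argument), and count inversions, which equals the number of adjacent swaps needed.

```python
from itertools import permutations

# My sketch: minimum adjacent swaps to group equal characters, by trying
# every ordering of the alphabet. Swaps to reach a fixed target = number of
# inversions when equal characters keep their relative order.
def min_grouping_swaps(x):
    best = None
    for order in permutations(sorted(set(x))):
        offsets = {}
        start = 0
        for c in order:                 # where each letter's block begins
            offsets[c] = start
            start += x.count(c)
        counts = {}
        seq = []
        for c in x:                     # target position of this occurrence
            seq.append(offsets[c] + counts.get(c, 0))
            counts[c] = counts.get(c, 0) + 1
        inv = sum(1 for i in range(len(seq))
                    for j in range(i + 1, len(seq)) if seq[i] > seq[j])
        best = inv if best is None else min(best, inv)
    return best

print(min_grouping_swaps("abcabccba"))  # 12, matching the hand count above
```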

**Reduction**: I should start by noting how similar this problem is to String-To-String Correction, especially since that problem is also NP-Complete in the case where the only operation allowed is swapping (instead of swapping or deleting). I was hoping to find a good simple reduction between these two problems. But I couldn’t figure out how to reconcile the fact that String-To-String Correction gives you a destination string to aim for, while Grouping By Swapping lets you end at any string, as long as the characters are grouped together. Doing all possible permutations of the alphabet as possible destination strings in a kind of Turing Reduction is too many possibilities, and I couldn’t think of a way to change x to force the best solution to this problem to relate to the needed y in the String-To-String Correction problem.

So, instead, we’ll use the paper by Paterson and Razborov that we used last week for the Non-Minimal Feedback Arc Set reduction, which reduces from that problem here. They actually show a stronger problem: given a string x, can we find a permutation of the alphabet that has fewer inversions (out-of-order pairs of characters) on x than the current one?

(Since an inversion is “fixed” by swapping two characters to put them in order, we can reduce from this problem to our grouping by swapping problem by setting K to be the number of inversions in x)

We’re given an instance of Non-Minimal Feedback Arc Set: a directed graph G=(V,A) and a set S of edges that forms a feedback arc set in G. Give each vertex a number so that if we have an edge (v_{i}, v_{j}) in A-S, then i < j (this is possible because removing S from G leaves an acyclic graph).
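A topological sort of the graph with S removed produces that numbering. A quick sketch (Kahn’s algorithm, my own code):

```python
from collections import defaultdict, deque

# My sketch: number the vertices so every edge of A - S goes from a lower
# number to a higher one (possible because A - S is acyclic).
def forward_numbering(V, A, S):
    remaining = [e for e in A if e not in S]
    indeg = {v: 0 for v in V}
    out = defaultdict(list)
    for u, v in remaining:
        out[u].append(v)
        indeg[v] += 1
    queue = deque(v for v in V if indeg[v] == 0)
    number = {}
    while queue:                       # Kahn's algorithm
        u = queue.popleft()
        number[u] = len(number)
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return number  # number[u] < number[v] for every (u,v) in A - S

V = [1, 2, 3]
A = [(1, 2), (2, 3), (3, 1)]
S = {(3, 1)}                           # a feedback arc set for the 3-cycle
num = forward_numbering(V, A, S)
print(all(num[u] < num[v] for u, v in A if (u, v) not in S))  # True
```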

The string that will be built is based on a palindrome. Palindromes have the property that no matter what permutation of the alphabet you choose, there is always the same number of inversions. (The way this relates to the original grouping by swapping problem is that *all* permutations of the characters in the final grouped string will have the same number of swaps). Note that if we swap 2 characters in a palindrome in the “wrong way” (away from where they will eventually end up) we will add 2 to the number of inversions.

Suppose |A| was p, and edge a_{j} was (u,v). Define e_{j} to be “uv” and E_{j} to be “vu”. The string w = e_{1}e_{2}..e_{p}e_{p}..e_{2}e_{1} is not quite a palindrome because we’re using “uv” in both the first half and the second half. The palindrome we want is: e_{1}e_{2}..e_{p}E_{p}..E_{2}E_{1}.

That palindrome has the same number of swaps to get to any legal grouping of characters. Call that number of swaps q. Given a permutation π of the alphabet, we can use the ordering of the vertices to find out how many edges in the graph are out of order in this permutation. Call this number back(π).

So the total number of inversions of w given a permutation π is q-|A|+2*back(π). (The inversions of the palindrome, minus all of the edges in A that were e instead of E, plus the penalty of 2 for each edge that is “wrong” in the inversion we’re using)

So, the original graph has feedback arc set smaller than S if and only if we can find a permutation π where back(π) < |S|. This happens if and only if we can find a permutation with less inversions than the identity permutation (the one that we got from the ordering of the vertices based on S), which only happens if w is a solution to the modified Grouping by Swapping problem.

**Difficulty: **9. I’ll be honest, I don’t really buy the end of the reduction. I don’t really see how they’re using the feedback set S anywhere except for the initial ordering. So I’m not following those last couple of “if and only if” steps. Maybe I’m missing something.

I do like the idea that a palindrome has the same number of inversions, no matter what your final order is. I wish I could think of a way to use that in a reduction from String-To-String Correction that was easier.

**The problem: **Non-Minimal Feedback Arc Set. This problem is not in the appendix, but is related to the regular Feedback Arc Set problem (GT8).

**The description: **Given a directed graph G and a subset S of the edges in G such that each cycle in G contains at least one edge in S (so S is a feedback arc set), can we find another feedback arc set S’ with fewer edges than S?

**Example: **The difference between this problem and the “regular” Feedback Arc Set problem is that in this problem we’re given a set and asking “can we do better”, and in the regular Feedback Arc set problem, we’re given a K and asking “can we do it in K or less”? Here is an example graph we used for the Feedback Arc Set reduction:

Suppose we had the set S of {(f,d), (d,e), (e,c), (b,c)}. This is a feedback arc set, I’ve drawn it in blue:

In this case, we can build an S’ that is smaller: {(c,a), (e,a), (d,e)}:

Notice that S’ does *not* have to be a subset of S. It just needs to have less edges.

**Reduction: **The reduction for this problem and next week’s is from a paper by Paterson and Razborov from 1991. (As an aside, while I know that 1991 is over 25 years ago, I spend so much time with papers from the 1970’s working on these problems, that 1991 feels “modern”).

It feels like there should be an easy reduction from regular Feedback Arc set to this problem, but I couldn’t think of one. To do it, you’d need to take the K from the feedback arc set and create an *actual* feedback arc set (to make your S). Even for something like a Turing reduction, the requirement that you actually need to create a feedback arc set (which itself is hard to do) makes it hard for me to see an easy reduction in that direction.

So instead, the paper uses 3SAT. Given a formula, for each variable u and clause i, add 4 vertices: A_{u,i}, B_{u,i}, a_{u,i}, and b_{u,i}. Within each quartet of vertices, add edges (a_{u,i}, b_{u,i}), (A_{u,i}, B_{u,i}), (b_{u,i}, A_{u,i}), and (B_{u,i}, a_{u,i}):

Notice that this forms a cycle (and remember, there is one of these cycles for each pair of variables and clauses, even if the variable doesn’t appear in that clause. Though to be honest, I’m not sure you need a copy if the variable doesn’t appear in the clause). The edges (a,b) or (A,B) will be chosen as the feedback edges, with the lowercase letters meaning setting the literal positively in the clause, and the capital letters meaning setting the literal negatively.

We still need to add edges that tie variables that occur in multiple clauses together. Each clause has 3 edges that make a cycle out of the 3 literals in the clause. The edges go from b->a, using the lowercase version of the letter if the literal is positive and the uppercase version if the literal is negative. The example in the paper is that if Clause i = {x_{3}, ~x_{4}, x_{6}}, we’d add the three edges (b_{3,i}, A_{4,i}), (B_{4,i}, a_{6,i}), and (b_{6,i}, a_{3,i}). Here are the three variable cycles for clause i with the new edges added:

Notice that the way we set a variable to make a literal true (for example, making x_{3} positive by adding the edge (a_{3}, b_{3}) to the feedback arc set) creates a feedback arc set both for the 4-vertex cycles, but also for the cycle for the clause.
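Here is a sketch of the gadget edges for one clause (my own code, following the construction above; the tuple vertex names are mine):

```python
# My sketch of the per-clause gadget. A literal is (var, positive?).
# For clause i with literals (l1, l2, l3) we build three 4-cycles plus the
# three b->a edges that chain the literals into a clause cycle.
def clause_gadget(i, literals):
    edges = []
    for var, _ in literals:
        a, b = ('a', var, i), ('b', var, i)
        A, B = ('A', var, i), ('B', var, i)
        edges += [(a, b), (A, B), (b, A), (B, a)]      # the 4-cycle
    for (v1, pos1), (v2, pos2) in zip(literals, literals[1:] + literals[:1]):
        b_end = ('b' if pos1 else 'B', v1, i)          # lowercase iff positive
        a_end = ('a' if pos2 else 'A', v2, i)
        edges.append((b_end, a_end))                   # clause cycle edge
    return edges

# The paper's example: clause i = {x3, ~x4, x6}
Ei = clause_gadget('i', [(3, True), (4, False), (6, True)])
print(len(Ei))  # 15 edges: three 4-cycles plus three clause edges
```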

The authors say that it is “straightforward” that 3SAT is still NP-Complete even when you are given an assignment of the variables that satisfies all but one of the clauses. Mapping this assignment to the feedback edges in the graph will give us a feedback set for each cycle of 4 vertices, and for all clause cycles except for one. Thus, we need an edge to “satisfy” this last clause. Call this set of edges (the edges corresponding to the all-but-one clause assignment, plus one edge in the last remaining cycle) S.

If we can find an S’ with *less* edges than S, it must have a way to set each variable to satisfy each clause, and thus the original formula is satisfiable. If we can’t, then the formula is unsatisfiable.

**Difficulty: **8. I don’t think the “satisfy all but one clause” thing is straightforward. I guess if you give that version of 3SAT to the students, you can sort of see what you need to do with the edges in the graph. But this way of looking at the problem took me a while to wrap my head around.

**The problem: **String-To-String Correction. This is problem SR20 in the appendix.

**The description: **Given two strings x and y over a finite alphabet ∑, and an integer K, can we start from x and derive y in K or fewer steps, where each step either deletes a symbol or interchanges two adjacent symbols?

**Example: **The puzzles I’m thinking of are the ones that say “Can you go from HELLO to OLE in 6 steps?”

- HELLO
- ELLO
- ELO
- EOL
- OEL
- OLE

The reason this is a hard problem is when you have repeated symbols, and need to “choose” which ones to delete. A slightly harder example is to go from ABABABABBB to BAAB. Now which A’s and B’s you choose to delete have an effect on the total number of moves.
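For small strings we can find the optimal sequence by breadth-first search over all reachable strings (my own sketch; fine for puzzle-sized inputs, hopeless in general since the problem is NP-Complete). On the HELLO-to-OLE puzzle above it finds 5 operations (the 6 lines include the starting string):

```python
from collections import deque

# My sketch: minimum number of deletes / adjacent swaps to turn x into y,
# by BFS over strings. Exponential in general; fine for tiny examples.
def min_steps(x, y):
    seen = {x}
    queue = deque([(x, 0)])
    while queue:
        s, d = queue.popleft()
        if s == y:
            return d
        nexts = []
        for i in range(len(s)):
            nexts.append(s[:i] + s[i+1:])                  # delete symbol i
        for i in range(len(s) - 1):
            nexts.append(s[:i] + s[i+1] + s[i] + s[i+2:])  # swap adjacent
        for t in nexts:
            # never shrink below |y|: deletes can't be undone
            if len(t) >= len(y) and t not in seen:
                seen.add(t)
                queue.append((t, d + 1))
    return None

print(min_steps("HELLO", "OLE"))  # 5: two deletes plus three swaps
```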

The paper by Wagner that has the reduction makes the point that if you replace “Delete” with “Insert”, you get basically the same problem but go from y to x instead of from x to y. The paper gives other variants that have polynomial solutions (allowing insertions and deletions and changes of a character, whether or not you allow interchange).

**Reduction: **Wagner uses Set Covering. So we’re given a collection of sets c_{1}..c_{n} all subsets of some set W, and an integer K.

Define t = |W| and r = t^{2}. Pick 3 symbols Q,R, and S that are not in W. The string x will be:

Q^{r}Rc_{i}Q^{r}S^{r+1}

..concatenated together each time for each c_{i}. (So the copy that has c_{1} is first, then another copy with c_{2}, and so on.)

The string y will be n copies of the substring RQ^{r}, followed by w, followed by n copies of the substring S^{r+1}

The K’ for this problem is (K+1)*r-1 + 2t(r-1) + n(n-1)(r+1)^{2}/2 + d.

d is r*n + |c_{1}| + … + |c_{n}| – |W|.

From here, Wagner explains that the way the correction problem works is by deciding which copies of the R, Q^{r}, and S^{r+1} substrings in x match the corresponding ones in y. By counting all of the operations that “have” to be done (by deleting or moving characters), he shows that a correction sequence of weight K’ has to cross over certain c_{i} sets to make a covering, and that there have to be K or fewer of those sets.

**Difficulty: 8**. The construction is pretty crazy. You can eventually see where the components come from, but I can’t imagine how he came up with these things (especially the x and y strings) in the first place.

**The problem: **Two-Dimensional Consecutive Sets. This is problem SR19 in the appendix.

**The description: **Given a finite alphabet ∑, and a collection C = {∑_{1}..∑_{n}} of subsets of ∑, can we partition ∑ into some number of sets X_{1}..X_{k} such that:

- Each X_{i} has at most one element in each set in C,
- For each ∑_{j} in C, there is an “index” l(j) (a number ≥ 0) where ∑_{j} is in X_{l(j)} ∪ X_{l(j)+1} ∪ … ∪ X_{l(j)+m-1}, where m = |∑_{j}|

**Example: **The “index” stuff is basically there to say “you can make each set in C by taking one element from consecutive X sets”. So, if C = {{a,b,c}, {b,e}, {b,c,e}, {b,c}} over the alphabet {a,b,c,d,e}, then we can make X = {{a,e}, {b}, {c,d}}

Then we can get each set in C by taking one element from consecutive sets in X.

An example of a C that can’t be realized would be: {{a,b}, {b,c}, {b,d}, {a,b,c,d}}

The element b needs to be in a set in X that is adjacent to the sets that contain a, c, and d. Since the set with b can only be adjacent to at most 2 other sets in X, some of the elements a, c, and d would need to share a set. But the set {a,b,c,d} prevents that from happening (its 4 elements must land in 4 distinct consecutive sets).
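With alphabets this small we can brute-force the definition (my own sketch, not from Lipski’s paper): enumerate every ordered partition of Σ and check both conditions.

```python
# My sketch: decide Two-Dimensional Consecutive Sets by brute force.
def ordered_partitions(elems):
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in ordered_partitions(rest):
        for i in range(len(part)):                  # join an existing block
            yield part[:i] + [part[i] | {first}] + part[i+1:]
        for i in range(len(part) + 1):              # or start a new block
            yield part[:i] + [{first}] + part[i:]

def realizes(X, C):
    for S in C:
        if any(len(Xi & S) > 1 for Xi in X):        # at most one element each
            return False
        m = len(S)
        windows = (set().union(*X[l:l+m]) for l in range(len(X) - m + 1))
        if not any(S <= w for w in windows):        # m consecutive X's cover S
            return False
    return True

def realizable(sigma, C):
    return any(realizes(X, C) for X in ordered_partitions(sorted(sigma)))

C1 = [{'a','b','c'}, {'b','e'}, {'b','c','e'}, {'b','c'}]
print(realizable('abcde', C1))   # True, e.g. X = [{a,e}, {b}, {c,d}]
C2 = [{'a','b'}, {'b','c'}, {'b','d'}, {'a','b','c','d'}]
print(realizable('abcd', C2))    # False, as argued above
```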

**Reduction: **G&J’s reference is to a paper by Lipski, which uses Graph Coloring. So we’re given a graph and an integer K. We’re going to build an alphabet that has one symbol for each vertex in V, K+2 “b” elements (numbered from 0 to k+1), and K-2 “a” elements (numbered from 1 to k-2) for each edge. (There is only one copy of the “b” elements. There is one set of the “a” elements for each edge). The sets in C are:

- One set containing all of the “b” elements.
- One set for each consecutive pair of “b” elements (so {b_{0}, b_{1}} all the way through {b_{k}, b_{k+1}})
- For each edge (v,w), a set containing the vertex symbols for v and w, all k-2 “a” symbols for that edge, b_{0}, and b_{k+1}

If G has a K-Coloring, we can split ∑ into k+2 pieces, numbered X_{0} to X_{k+1} (Notice that because of the set with all of the “b” elements in C, we need to have at least that many sets in X). Each X_{i} gets its corresponding “b” element. For each edge (v,w), we know that vertices v and w each have different colors (numbered from 1-K). We put v’s symbol in the X set numbered with v’s color, and we put w’s symbol in the X set numbered with w’s color. We put the remaining “a” symbols for that edge in the other k-2 X sets between 1 and k. This creates a two-dimensional partition of all of the sets in C.

If we have a two-dimensional partition of C, then list them out in order of how they sort the “b” elements (So the set with b_{0} is X_{0} and so on). It’s possible that we have to do a reverse ordering (the set with b_{0} is X_{k+1} and the set with b_{k+1} is X_{0}), but we can treat that as basically the same. Since the sets in C that were based on edges contain b_{0} and b_{k+1}, the other elements in that set need to fit in sets X_{1} through X_{k}. So the partition has to have exactly K+2 sets, and the vertices all fall in sets X_{1} through X_{k} with no two vertices sharing an edge in the same set. This gives a K-coloring of the vertices in the graph.

**Difficulty: **7. It sort of makes sense that the “two-dimensional” problem needs us to create a “two-dimensional” set of elements (the “a” elements), but I don’t think I would expect a student to be able to come up with these elements and these steps without a lot of help.

**The problem: **Consecutive Sets. This is problem SR18 in the appendix.

**The description: **Given a collection C of subsets of a finite alphabet Σ, and a positive integer K, can I find a string w from ∑* with K symbols or less, such that each element in C has its characters occur sequentially in w?

**Example: **Let ∑ = {a,b,c,d,e} and C = {{a,b,c}, {a,d,e}, {b,e}, {c,d}}

Then I can choose w to be dcbadeb, which is length 7, which is the best I could find. Notice how positions 2-4 contain {a,b,c}, positions 4-6 contain {a,d,e}, positions 6-7 contain {b,e}, and positions 1-2 contain {c,d}.
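The covering condition is easy to check mechanically (my own sketch): each set in C must appear as some length-|c| window of w containing exactly its characters, in any order.

```python
# My sketch: does string w realize collection C for the Consecutive Sets
# problem? Each set must occur as a contiguous window of w (in any order).
def realizes(w, C):
    return all(
        any(set(w[i:i + len(c)]) == c for i in range(len(w) - len(c) + 1))
        for c in C
    )

C = [{'a','b','c'}, {'a','d','e'}, {'b','e'}, {'c','d'}]
print(realizes("dcbadeb", C))   # True: the windows found above
print(realizes("abcde", C))     # False: no window contains {b,e}
```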

**Reduction: **This is “Problem 4” in the paper by Kou from last week. He reduces from Hamiltonian Path. So we start with a graph G=(V,E). Our alphabet Σ (he uses “R” in the paper) has one symbol for each vertex in V and one symbol for each edge in E. Our C set (he uses F in the paper) will have one set for each vertex. The symbols in the set for a vertex i will consist of the edge symbols for the edges incident on i, and the vertex symbol for i itself. K will be 2*|E|+1.

Notice that no two sets in C are subsets of each other (because they each hold a separate symbol from V), and that the intersection of 2 sets from C is a single edge (the edge connecting the two vertices corresponding to the sets), if that edge exists, and the intersection is empty otherwise. A string w that we make will be based on a sequence of vertices (and their C sets). Since each edge appears in two C sets, the “initial” size of w is 2|E| + |V|, if we have no overlap between C sets in w. The question is how small can it get?

Since each C set can have at most one symbol of overlap with another C set, and that only happens if an edge exists between the vertices that define the C sets, the only way to create an overlap to reduce the size of w is to order the vertices such that there are edges between consecutive pairs of vertices. G has a Hamiltonian Path if and only if we can order the vertices in such a way that there is an edge between each pair of consecutive vertices. In w, that means we can overlap 1 symbol from each pair of consecutive C sets. Since there are |V|-1 pairs of vertices in a path, the minimum size of w is 2|E| + |V| – (|V|-1) = 2|E| + 1 = K.

**Difficulty: **5. I think this is a neat reduction that students can get, though it might take a bit of work to make sure they understand the problem.

**The problem: **Consecutive Block Minimization. This is problem SR17 in the appendix.

**The description: **Given an m x n matrix A consisting of 0’s and 1’s and a positive integer K, can we permute the columns of A so that the permuted matrix (B) has K or less blocks of consecutive 1’s. G&J define a “block” as an entry b_{ij} =1 where b_{i,j+1} = 0 or j = n. So a “block” is one-dimensional, going across a row.

**Example: **Here’s a matrix:

1 | 0 | 1 | 0 | 1 |

1 | 1 | 1 | 0 | 1 |

0 | 0 | 1 | 1 | 0 |

0 | 1 | 1 | 0 | 0 |

1 | 1 | 0 | 0 | 1 |

Right now, this has 9 blocks (3 on the first row, 2 on the second and fifth, 1 on the third and fourth).

But we can permute the columns to get:

0 | 1 | 1 | 1 | 0 |

1 | 1 | 1 | 1 | 0 |

0 | 0 | 0 | 1 | 1 |

1 | 0 | 0 | 1 | 0 |

1 | 1 | 1 | 0 | 0 |

..for a total of 6 blocks.
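Counting blocks is a one-liner per row (my own sketch), which confirms the 9 and 6 above:

```python
# My sketch: count G&J-style blocks (maximal runs of 1's within each row).
# An entry ends a block when it is 1 and is either at the end of the row
# or followed by a 0.
def count_blocks(matrix):
    return sum(
        sum(1 for j, v in enumerate(row)
            if v == 1 and (j == len(row) - 1 or row[j + 1] == 0))
        for row in matrix
    )

A = [[1,0,1,0,1], [1,1,1,0,1], [0,0,1,1,0], [0,1,1,0,0], [1,1,0,0,1]]
B = [[0,1,1,1,0], [1,1,1,1,0], [0,0,0,1,1], [1,0,0,1,0], [1,1,1,0,0]]
print(count_blocks(A), count_blocks(B))  # 9 6
```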

**Reduction: **The paper by Kou has several reductions in it; we’ll be using this paper next week too. This problem is “Problem 3” in the paper. The reduction uses Hamiltonian Path, but with costs for edges, so it’s really more like “Traveling Salesman on Paths”. His “Problem 2” takes an instance of regular Hamiltonian Path and shows that this “Traveling Salesman on Paths” problem is NP-Complete on “cost graphs”.

Kou defines a “cost graph” of a matrix as a complete graph with one vertex for each row, where the cost of edge (i,j) is the number of columns in A that have a 1 in row i and a 0 in row j. He proves in the paper that a Hamiltonian Path on the cost graph has cost k iff there are k+c blocks of 1’s in the matrix, where c is the number of 1’s in the last row of the matrix.
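We can check that relationship on the example above (my own sketch; I’m stating it column-wise, i.e. the transpose of the row-wise wording in the paragraph, since here we permute columns and count blocks within rows: cost(u,v) counts rows with a 1 in column u and a 0 in column v, and c counts the 1’s in the last column of the arrangement):

```python
# My sketch of Kou's cost relation, in column-wise (transposed) form:
# arranging the columns in some order, total blocks = path cost + ones in
# the last column, where cost(u, v) counts rows with a 1 in u and a 0 in v.
def col_cost(matrix, u, v):
    return sum(1 for row in matrix if row[u] == 1 and row[v] == 0)

def count_blocks(matrix):
    return sum(sum(1 for j, x in enumerate(row)
                   if x == 1 and (j == len(row) - 1 or row[j + 1] == 0))
               for row in matrix)

A = [[1,0,1,0,1], [1,1,1,0,1], [0,0,1,1,0], [0,1,1,0,0], [1,1,0,0,1]]
order = [1, 0, 4, 2, 3]          # the column order used in the example above
path_cost = sum(col_cost(A, order[i], order[i + 1])
                for i in range(len(order) - 1))
c = sum(row[order[-1]] for row in A)
B = [[row[j] for j in order] for row in A]
print(path_cost + c, count_blocks(B))  # both are 6
```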

So, the full reduction takes an instance of HP, creates an incidence matrix (as we defined in the Consecutive Ones Matrix Augmentation problem), builds a cost graph of that, and uses that cost graph as the input to the Consecutive Block minimization problem.

**Difficulty: **7. The cost graph construction is pretty hard to grasp, and you need to prove some extra lemmas along the way to show that the transformation works.

**The problem: **Consecutive Ones Matrix Partition. This is problem SR15 in the appendix.

**The description: ** Given an mxn matrix A consisting of 0’s and 1’s, can we partition the rows of A into two groups m_{1} and m_{2} such that the matrices A_{1} = m_{1}x n and A_{2} = m_{2}x n both have the consecutive ones property?

**Example: **Here’s the matrix that does not have the consecutive ones property.

0 | 1 | 1 | 1 | 1 |

1 | 1 | 1 | 0 | 1 |

0 | 0 | 1 | 1 | 0 |

0 | 1 | 1 | 0 | 0 |

1 | 1 | 0 | 0 | 0 |

But if we break it into two matrices, for example:

0 | 1 | 1 | 1 | 1 |

1 | 1 | 1 | 0 | 1 |

0 | 1 | 1 | 0 | 0 |

and

0 | 0 | 1 | 1 | 0 |

1 | 1 | 0 | 0 | 0 |

The two matrices have the consecutive ones property (the second matrix already has it, the top matrix has it if you exchange the 4th and 5th columns.)
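Checking the consecutive ones property by brute force over column permutations is feasible at this size (my own sketch; a real algorithm would use PQ-trees):

```python
from itertools import permutations

# My sketch: a row's 1's are contiguous iff the span of their positions
# equals their count.
def is_contiguous(row):
    ones = [j for j, v in enumerate(row) if v == 1]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)

# A matrix has the consecutive ones property if some column permutation
# makes every row contiguous. Brute force, fine for n = 5 columns.
def has_c1p(matrix):
    n = len(matrix[0])
    for perm in permutations(range(n)):
        if all(is_contiguous([row[j] for j in perm]) for row in matrix):
            return True
    return False

A  = [[0,1,1,1,1], [1,1,1,0,1], [0,0,1,1,0], [0,1,1,0,0], [1,1,0,0,0]]
A1 = [[0,1,1,1,1], [1,1,1,0,1], [0,1,1,0,0]]
A2 = [[0,0,1,1,0], [1,1,0,0,0]]
print(has_c1p(A), has_c1p(A1), has_c1p(A2))  # False True True
```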

**Reduction: **G&J point to a “private communication”, but I found a technical report by Lipski that explains the reduction. The reduction is from Hamiltonian Path on cubic graphs (each vertex has degree at most 3). So we start with a graph G=(V,E) where each vertex is degree 3 or less. We construct a new graph G’=(V’,E’) where G’ adds new vertices and edges so that each vertex in V has degree 4. Each new vertex has degree 1 (and so the new edges we add to V always connect to distinct new vertices). Take the incidence matrix of this graph (as defined in the Consecutive Ones Submatrix problem) and use it as A.

If G has a Hamiltonian Path, then we can split A into two parts: the vertices in the path (thus, all of V) and everything else. The Hamiltonian Path has the consecutive ones property, just like we saw in the Consecutive Ones Submatrix problem. The other rows just map to disjoint paths of length 2 (two degree one edges connected by some vertex in V) so will be able to be rearranged.

If A has a partition that works, then choose some vertex v in V. Suppose it ends up in A_{1}. We know that this vertex has degree 4, and that to satisfy the consecutive ones property, v needs to be adjacent to exactly 2 things in A_{1} and 2 things in A_{2}. The vertices in A_{1} need to form a chain of vertices all in V to make the consecutive ones property hold, so will form a Hamiltonian Path.

**Difficulty: **6. The paper glosses over a lot of detail, and I did too. For example, the paper says “Without loss of generality we may assume that V ∈ M_{1}” (the paper’s version of A_{1}). I don’t think it’s obvious at all that the entirety of V will end up all in one partition or the other. I think it’s true, but I think it needs more work from a student to explain why it works.

**The problem: **Consecutive Ones Matrix Augmentation. This is problem SR16 in the appendix.

**The description: **Given an mxn matrix of 0’s and 1’s and a positive integer K, can we change K or fewer 0’s into 1’s to give the matrix the consecutive ones property. See the Consecutive Ones Submatrix problem for more detail on this property.

**Example: **Here is the matrix that did not have the property from last week:

0 | 1 | 1 | 1 | 1 |

1 | 1 | 1 | 0 | 1 |

0 | 0 | 1 | 1 | 0 |

0 | 1 | 1 | 0 | 0 |

1 | 1 | 0 | 0 | 0 |

Recall that we’re looking for a way to permute the columns so that there are no gaps in the 1’s in each row. The easy solution here is to add a 1 in the second row (making it all 1’s). In that case, the matrix has the consecutive ones property without needing any column permutations.

**Reduction: **Again, it is found in the dissertation by Booth, but this time uses Optimal Linear Arrangement. Recall in that problem we were given a graph and needed to find an ordering of vertices to minimize the “score” of the differences of the ordering values.

The corresponding idea here is to start with the graph and build an incidence matrix, just like in the Consecutive Ones Submatrix problem. We notice that any permutation of the rows (in the dissertation, you permute the rows to find consecutive ones in each column) corresponds to a permutation of the vertices in the OLA problem. If we took a permutation of the vertices, each edge (column) of the incidence matrix has exactly two 1’s in it. There may or may not be any 0’s between those ones, but if there are, we can convert those 0’s into 1’s (this is the augmentation).

The way these problems relate is that if an edge exists between two vertices v_{i} and v_{j}, located at positions i and j in the final permutation, that edge adds j-i to the cost (assuming j > i). In our matrix, if we permute the row corresponding to vertex v_{i} to row i, and the row corresponding to vertex v_{j} to row j, then there are j-i-1 0’s between the two 1’s in that column. So we could set them all to 1, and that column would have the consecutive ones property.

Thus, a solution of cost K to the OLA problem corresponds to a solution to the Consecutive Ones Matrix Augmentation problem of cost K-|E|, since there are |E| columns in the incidence matrix, and each one differs by 1 from the OLA cost of the edge.
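Concretely (my own sketch): given a vertex ordering, the number of 0’s we must flip is the OLA cost minus |E|, since each edge’s column contributes (distance − 1) zeros between its two 1’s.

```python
# My sketch: relate an OLA ordering to the augmentation cost.
# pos maps each vertex to its row in the permuted incidence matrix.
def ola_cost(edges, pos):
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

def zeros_to_fill(edges, pos):
    # per edge (column), the 0's strictly between its two 1's
    return sum(abs(pos[u] - pos[v]) - 1 for u, v in edges)

edges = [(1, 2), (2, 3)]            # a 3-vertex path graph
pos = {1: 0, 3: 1, 2: 2}            # the ordering 1, 3, 2
print(ola_cost(edges, pos))         # 2 + 1 = 3
print(zeros_to_fill(edges, pos))    # 3 - |E| = 1
```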

**Difficulty**: 6, mainly because of the incidence matrix/adjacency matrix difficulty I talked about last time, but also because OLA is not easy to get. If you’ve already shown them incidence matrices (for example, because you’ve gone over the Consecutive Ones Submatrix problem), then this is probably a 4.