K-Vertex Cover

This week we do a variant of Vertex Cover that I don’t think I’ve seen before, but which we’ll use for the next problem.

After this week, I’m going to be traveling for three weeks.  So the next post will be the week of August 14.

The problem: K-Vertex Cover.  This problem is not in the appendix.

The description: Given a graph G=(V,E) and integers K and J, does G contain a set of J or fewer paths, where each path contains K or fewer edges, and each edge in E shares an endpoint with at least one of these paths?

Example:  Here is a graph:

A K-Vertex Cover where K = 0 is just a regular Vertex Cover (each path is a single vertex):

With K=1, this is “Find a subset E’ of E where each edge in E shares an endpoint with something in E'”:

With K=2, we are finding paths of 2 (or fewer) edges where each edge in E is adjacent to something in a path (I colored each path a different color to make them easier to tell apart):

Notice that K is set as a parameter of the problem.
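To make the definition concrete, here is a quick solution checker in Python (the function name and graph representation are mine, just for illustration):

```python
def is_k_vertex_cover(edges, paths, k, j):
    """Check a candidate K-Vertex Cover: edges is a set of (u, v)
    pairs, paths is a list of vertex lists (a single vertex counts
    as a path with 0 edges)."""
    if len(paths) > j:
        return False
    covered = set()  # every vertex appearing on some path
    for path in paths:
        if len(path) - 1 > k:  # a path on m vertices has m-1 edges
            return False
        # consecutive vertices on a path must actually be edges of G
        if any((u, v) not in edges and (v, u) not in edges
               for u, v in zip(path, path[1:])):
            return False
        covered.update(path)
    # every edge of G must share an endpoint with some path
    return all(u in covered or v in covered for u, v in edges)
```

With K = 0 every path is a single vertex, so this collapses to the ordinary Vertex Cover check.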

Reduction: The report by Storer that had the previous reduction (and which will have the next one) introduces this problem.  The reduction is from Vertex Cover.  I guess we could just take the initial VC instance, set K=0, and be done, but I like this construction because it shows how to make 1-VC, 2-VC, and so on NP-Complete.  (In other words, if K is fixed outside of the problem, you need this reduction.)  Interestingly, a |V|-vertex cover is trivial (it’s asking if the graph has J or fewer connected components).

Given our VC instance G=(V,E) with bound J (we use J instead of the usual K because K means something different here; the J for our problem will be the same J as the VC instance), build a new graph G’=(V’, E’):

V’ starts with V, and adds two new vertices for each pair (i,j) with 1 ≤ i ≤ J and 1 ≤ j ≤ K (so 2*K*J new vertices), labeled x1,1 through xJ,K and y1,1 through yJ,K.  Notice that since J and K should be ≤ |V| (or the answer is trivial), this only adds a polynomial number of vertices.

E’ starts with E, and adds an edge from each vertex in V to every x vertex labeled xi,1, an edge from each xi,j to its corresponding yi,j, and an edge from each xi,j to the next xi,j+1.
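Here is the construction as a Python sketch; the tuple encoding of the x and y vertices is my own, and any labeling scheme would do:

```python
def build_k_vc_instance(vertices, edges, j, k):
    """Build G' = (V', E') from the Vertex Cover instance G = (V, E)
    with bound j, following the construction above.  Assumes k >= 1
    (for k = 0 you would just use the original VC instance)."""
    new_vertices = set(vertices)
    new_edges = set(edges)
    for i in range(1, j + 1):
        for m in range(1, k + 1):
            new_vertices.add(('x', i, m))
            new_vertices.add(('y', i, m))
            # each x vertex gets its own pendant y vertex
            new_edges.add((('x', i, m), ('y', i, m)))
            # chain row i's x vertices together into a path
            if m < k:
                new_edges.add((('x', i, m), ('x', i, m + 1)))
        # every original vertex connects to the start of each chain
        for v in vertices:
            new_edges.add((v, ('x', i, 1)))
    return new_vertices, new_edges
```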

The idea is that each edge from an xi,j to its yi,j has to be covered, so we will need to include paths that go through every x vertex in our k-cover.  Since there are J paths of length K, and each vertex in V is connected to every xi,1, what happens is that each of the J vertices chosen in the VC of G “chooses” a different chain of x vertices.  So we can cover all of the edges in G’ with J paths if and only if we can cover all of the edges in G with J vertices.

Difficulty: 5.  This is a pretty cool reduction.  I like the idea of adding K*J copies of the vertices, because while you can do that for a graph problem (since K and J realistically need to be < |V|), you can’t do that for other kinds of problems.  For example, trying to create an element for each value from 1 to K in the Sum of Subsets problem won’t give you a polynomial reduction.

External Macro Data Compression

These next few problems are all related “macro compression” problems.

The problem: External Macro Data Compression.  This is problem SR22 in the appendix.

The description: Given a string s over some alphabet Σ, a “pointer cost” h, and a bound B, can we find two strings D and C over the alphabet Σ augmented with some number (< |s|) of “pointer characters” pi such that:

  • |D| + |C| + (h-1)*(# of pointer characters in D and C) ≤ B
  • We can generate s by replacing pointer characters in C with their “meaning” in D.

Example:  The concept they’re trying to get at isn’t that hard; it’s just hard to explain using mathematical language.  Suppose s = “ABCDEFABCGABCDEF”.

Then we can define D to be “ABCDEF” and C to be “pqpGpq”.  The total size of this is 6 for D, plus 6 for C.  There are 5 pointer characters, so our total cost is 6+6+5(h-1).

The idea is that the characters p and q are “pointers” that refer to substrings of D (p refers to “ABC” and q refers to “DEF”).  By replacing those pointers in C with what they “mean” in D, we can get back s.

A tricky part of this is that you are allowed to have the substrings overlap.  So if s was “ABCDBCD” we could define D to be “ABCD” and C to be “pq”, with p meaning “ABCD” and q meaning “BCD”.  Now our total cost is 4 for D, 2 for C, and 2 pointer characters, so 4+2+2(h-1).
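To check the arithmetic, here is a little Python sketch; I pass each pointer’s “meaning” in as an explicit substring of D, which glosses over how the pointers are actually encoded in the problem:

```python
def expand(c, meanings):
    """Replace each pointer character in C by the substring of D it
    refers to; ordinary characters pass through unchanged."""
    return ''.join(meanings.get(ch, ch) for ch in c)

def cost(d, c, pointers, h):
    """|D| + |C| + (h-1) * (number of pointer characters used)."""
    num_pointers = sum(1 for ch in d + c if ch in pointers)
    return len(d) + len(c) + (h - 1) * num_pointers

# First example: s = "ABCDEFABCGABCDEF"
d, c = "ABCDEF", "pqpGpq"
meanings = {'p': "ABC", 'q': "DEF"}    # both substrings of D
assert expand(c, meanings) == "ABCDEFABCGABCDEF"
print(cost(d, c, {'p', 'q'}, h=2))     # 6 + 6 + 5*(2-1) = 17

# Overlap example: s = "ABCDBCD"
d2, c2 = "ABCD", "pq"
meanings2 = {'p': "ABCD", 'q': "BCD"}  # overlapping substrings of D
assert expand(c2, meanings2) == "ABCDBCD"
print(cost(d2, c2, {'p', 'q'}, h=2))   # 4 + 2 + 2*(2-1) = 8
```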

Reduction:  The reduction (and the one for SR23) comes from a technical report by Storer, which is pretty dense.  I think we’re looking at “Theorem 2” in the paper, which reduces from VC≤3 (Vertex Cover restricted to graphs where no vertex has degree more than 3).

The alphabet that will be built from the VC instance has a lot of parts:

  • A special symbol $
  • 3 symbols vi, ai, and bi for each vertex vi in the graph
  • 4 more symbols fi,1,1 through fi,2,2 for each vertex vi in the graph
  • one symbol di for each edge ei in the graph.
  • 2 symbols c1 and c2 (there is actually one c symbol for each value from 1 to h.  We’ll assume h=2 here)
  • 3 symbols g1,1, g1,2, and g2,1 (this is also based on h: in general you’d go from g1,1 through gh-1,2 and add gh,1)

The string s will also be built from parts (a lot of these are based on h as well; again, I’m fixing h=2 to keep it simpler):

  • Vi,1 = ai$vi
  • Vi,2 = vi$bi
  • For each edge ei = (vj, vk), Ei = $vj$vk$
  • We also have Z1 = (c1)^3 (so 3 copies of c1) and Z2 = (c2)^3

s is:

  • Eidi  concatenated for each edge, followed by
  • Vi,jfi,j,k concatenated over each vertex and each possible f symbol (so for vertex vi: Vi,1fi,1,1Vi,1fi,1,2Vi,2fi,2,1Vi,2fi,2,2), followed by
  • Z1g1,1Z1g1,2, followed by
  • Z2g2,1Z2

The bound for the compression instance is K’ = |s| + K – (7/2)|V|, where K is the bound from the Vertex Cover instance.
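Here is my reading of the construction as a Python sketch, with h fixed at 2 as above.  Each symbol is represented as its own string, and the exact ordering of the pieces is my interpretation of the description, so treat it as a sketch rather than Storer’s exact string:

```python
def build_emdc_instance(num_vertices, edges, K):
    """Build the string s (as a list of symbols) and the bound K'
    for the reduction, with h fixed at 2.  Edges are pairs of
    1-indexed vertex numbers."""
    s = []
    # E_i d_i for each edge e_i = (v_j, v_k), with E_i = $ v_j $ v_k $
    for i, (vj, vk) in enumerate(edges, start=1):
        s += ['$', f'v{vj}', '$', f'v{vk}', '$', f'd{i}']
    # V_(i,j) f_(i,j,k) over each vertex and each f symbol, where
    # V_(i,1) = a_i $ v_i  and  V_(i,2) = v_i $ b_i
    for i in range(1, num_vertices + 1):
        V = {1: [f'a{i}', '$', f'v{i}'], 2: [f'v{i}', '$', f'b{i}']}
        for j in (1, 2):
            for k in (1, 2):
                s += V[j] + [f'f{i},{j},{k}']
    # Z_1 g_(1,1) Z_1 g_(1,2) then Z_2 g_(2,1) Z_2, with Z_m = (c_m)^3
    z1, z2 = ['c1'] * 3, ['c2'] * 3
    s += z1 + ['g1,1'] + z1 + ['g1,2'] + z2 + ['g2,1'] + z2
    k_prime = len(s) + K - 3.5 * num_vertices  # K' = |s| + K - (7/2)|V|
    return s, k_prime
```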

The basic idea from here is that if G has a VC, we can compress s by making pointers for (among other things) the combination Vi,1fi,1,2Vi,2 and the combination Vi,2fi,2,2Vi+1,1, where vertex vi is in the VC.  This lets us use the pointers both for the part of the string with the V’s and f’s and for the $vi$ strings in the E components, saving space.

In the other direction, if we have a compressed string of size K’, he shows that you must have compressed the string in a way like the one above, and the overlaps show you the vertices in the cover.

Difficulty: 8.  I think to really get what’s happening, you need to actually build an example on a graph.  But I think the idea of building the strings so that the overlaps are exactly what you’re looking for is something that can be gotten eventually.