I’m finally back from my trip. Apologies for the delay.

**The problem: **Internal Macro Data Compression. This is problem SR23 in the appendix.

**The description: **Given an alphabet Σ, a string s in Σ*, a “pointer cost” h and a bound B, can we find a string C in the alphabet of Σ augmented with at most s “pointer characters” p_{i} such that:

- |C|+ (h-1)* (# of pointer characters in C) is ≤ B, and
- We can replace pointer characters with substrings of C to regain s?

**Example: **This is similar to the last macro compression problem, but instead of having 2 strings C and D, C serves both purposes. In our example from last time, we had s = “ABCDEFABCGABCDEF”, and came up with C= “pqpGpq” and D= “ABCDEF”. The difference here is that there is no D string, we have to do it all in C. We can do that by letting C = “ABCDEFqpGpq”. Now if we let p=”ABC” (or, “the substring of C of length 3 starting at position 1” and q = “DEF” (or, “the substring of C of length 3 starting at position 4”, we get a string of cost 11+3h that gives us back s if we replace all pointers with the strings they represent.

**Reduction: **The paper by Storer that we’ve been using for the last few problems also has this reduction. I think this is his “CPM” problem. He uses the “K-Vertex Cover” problem, in the case where K=1. Recall in that case we are looking for a set of edges E’ such that each edge in E shares a vertex with an edge in E’. We start with a graph, which is an instance of this problem. The alphabet we build will have:

- A special symbol $
- One symbol for each vertex and edge in G

For each edge e_{i} = (v_{j}, v_{k}), define the substring E_{i } = $v_{j}$v_{k}$

Our s = the concatenation of e_{i}E_{i} for all edges i (So the symbol for the edge followed by the substring of the edge). B = J + |S| – |E|.

The idea (and I’ll admit I have a hard time following this paper) is that if G has a 1-VC, then we have J edges that form a set E’ such that each edge in E is adjacent to something in E’. This means that we can overlap those edges and make pointers out of the E_{i} strings to save enough characters.

**Difficulty: **9. I’m sure this can be explained better, but I really have a hard time following it.