Sometimes you need a nudge to see the right way to do a reduction. The reduction to this problem is based on a reduction for a similar problem, which encouraged me to look at the problem in a way that I probably should have noticed myself.

**The problem: **Expected Component Sum. This is problem SP18 in the appendix.

**The description: **Given a collection V of m-dimensional vectors, where each entry in each vector is a non-negative integer. We’re also given positive integers K and B. Can we partition V into K disjoint sets V_{1} through V_{K} such that:

- For each V
_{i}, we look at each position in each vector (from 1 to m), and we sum up the elements in that position in that V_{i} - For each V
_{i}, we find the position with the largest sum - We sum together the largest position sums of each V
_{i}. Is that sum at least B?

**Example: ** Suppose we have 4 vectors, each with 5 elements.

- v
_{1}= (1,2,3,4,5) - v
_{2}= (9,2,4,6,8) - v
_{3}= (3,7,1,1,4) - v
_{4}= (2,9,10,3,11)

If K=2, then we can create:

- V
_{1}= v_{1}and v_{2}Column 5 has the highest sum (15) - V
_{2}= v_{3}and v_{4}. Column 2 has the highest sum (16)

The total of the sums from each element of the partition is 15+16 = 31.

**Reduction: **G&J say to use X3C, and also mention two important facts:

- The problem is still NP-Complete even if the elements in all vectors are 0 and 1. This implies to me that the vectors should be Boolean representations of participation in the sets.
- The problem is
*no longer*NP-Complete if we fix K. This implies that the K value we choose in the reduction needs to be based on the X3C input somehow, and can’t be a simple number like 2 or 3.

My first pass of working on the reduction had me create a vector for each set in C, with positions in the vector corresponding to elements in X. (Thus, each vector would have ones in exactly 3 positions and zeroes everywhere else.) My natural inclination was to set K to 2 (one set for the cover, one set for “everything else”- the sets in C that were not in the cover). But that ran afoul of the prohibition of a fixed value for K.

I toyed with the idea of inverting the sets but didn’t get very far. Then in some web-searching for inspiration, I found a paper by Roy, Lakshmanan, and Liu that worked on a similar problem. They call their version the “Perfect Expected Component Sum” problem and works similarly except they fix B to be equal to |C|, and want the final sum to be exactly equal to B, instead of at least B.

The key idea is to have one vector for each element in X, and to have the dimensionality of the vector to be the size of the sets in C. So each vector v_{i} has a 1 in position j if set C_{j} contains element i. Now each position in the vector (from 1 to |C|) has exactly three vectors with a 1 in that position (the three elements that make up the set corresponding to that position). We set K=q and B=3q.

If a C’ exists that is an exact cover of X, then C’ consists of {C’_{1} .. C’_{q}} – exactly q sets from C that contain each element in X exactly once. Then we can partition V into sets of 3 vectors that correspond to the three elements of each C’_{i}. So the first partition has the 3 element vectors that correspond to the 3 elements in C’_{1}, the second partition has the 3 element vectors that correspond to the 3 elements in C’_{2}, and so on down. Each partition will have one column that has a 1 in all 3 elements, and so the maximum sum of all columns will be 3. Since we have q different sets of vectors in our partition of V, and each contributes 3 to the sum, our total sum is 3q=B.

If we have a partition of vectors that sums to at least V, notice that no set in C has more than 3 elements, therefore no column sum of any set of vectors in our partition will sum to more than 3. Thus, the only way to reach a sum of at least B is to have a sum of 3 in each of the sets in the partition. Since there are 3q vectors in V, this can only be accomplished by having exactly 3 vectors in each set in the partition. These 3 sets must have at least one column that has 1’s in all entries in that column. This column tells us which set to choose for the cover of X.

**Difficulty: **5. I’m pretty embarrassed I didn’t come up with the idea of using elements as vectors and sets as boolean entries in the vector. It’s very similar to the graph theory reductions from 3SAT where we have vertices for edges and clauses, and an edge between a literal vertex and a clause vertex if the literal is in that clause. There’s a similar property there that each clause has degree 3, that you can exploit.