# Tag Archives: X3C

## Expected Component Sum

Sometimes you need a nudge to see the right way to do a reduction.  The reduction to this problem is based on a reduction for a similar problem, which encouraged me to look at the problem in a way that I probably should have noticed myself.

The problem: Expected Component Sum.  This is problem SP18 in the appendix.

The description: Given a collection V of m-dimensional vectors, where each entry in each vector is a non-negative integer.  We’re also given positive integers K and B.  Can we partition V into K disjoint sets V1 through VK such that:

• For each Vi, we look at each position in each vector (from 1 to m), and we sum up the elements in that position in that Vi
• For each Vi,  we find the position with the largest sum
• We sum together the largest position sums of each Vi.  Is that sum at least B?

Example: Suppose we have 4 vectors, each with 5 elements.

• v1 = (1,2,3,4,5)
• v2 = (9,2,4,6,8)
• v3 = (3,7,1,1,4)
• v4 = (2,9,10,3,11)

If K=2, then we can create:

• V1 = v1 and v2  Column 5 has the highest sum (15)
• V2 = v3 and v4.  Column 2 has the highest sum (16)

The total of the sums from each element of the partition is 15+16 = 31.

Reduction: G&J say to use X3C, and also mention two important facts:

1. The problem is still NP-Complete even if the elements in all vectors are 0 and 1.  This implies to me that the vectors should be Boolean representations of participation in the sets.
2. The problem is no longer NP-Complete if we fix K.  This implies that the K value we choose in the reduction needs to be based on the X3C input somehow, and can’t be a simple number like 2 or 3.

My first pass of working on the reduction had me create a vector for each set in C, with positions in the vector corresponding to elements in X.  (Thus, each vector would have ones in exactly 3 positions and zeroes everywhere else.)  My natural inclination was to set K to 2 (one set for the cover, one set for “everything else”- the sets in C that were not in the cover).  But that ran afoul of the prohibition of a fixed value for K.

I toyed with the idea of inverting the sets but didn’t get very far.  Then in some web-searching for inspiration, I found a paper by Roy, Lakshmanan, and Liu that worked on a similar problem.  They call their version the “Perfect Expected Component Sum” problem and works similarly except they fix B to be equal to |C|, and want the final sum to be exactly equal to B, instead of at least B.

The key idea is to have one vector for each element in X, and to have the dimensionality of the vector to be the size of the sets in C.  So each vector vi has a 1 in position j if set Cj contains element i.  Now each position in the vector (from 1 to |C|)  has exactly three vectors with a 1 in that position (the three elements that make up the set corresponding to that position).  We set K=q and B=3q.

If a C’ exists that is an exact cover of X, then C’ consists of {C’1 .. C’q} – exactly q sets from C that contain each element in X exactly once.  Then we can partition V into sets of 3 vectors that correspond to the three elements of each C’i.  So the first partition has the 3 element vectors that correspond to the 3 elements in C’1, the second partition has the 3 element vectors that correspond to the 3 elements in C’2, and so on down.  Each partition will have one column that has a 1 in all 3 elements, and so the maximum sum of all columns will be 3.  Since we have q different sets of vectors in our partition of V, and each contributes 3 to the sum, our total sum is 3q=B.

If we have a partition of vectors that sums to at least V, notice that no set in C has more than 3 elements, therefore no column sum of any set of vectors in our partition will sum to more than 3.  Thus, the only way to reach a sum of at least B is to have a sum of 3 in each of the sets in the partition.  Since there are 3q vectors in V, this can only be accomplished by having exactly 3 vectors in each set in the partition.  These 3 sets must have at least one column that has 1’s in all entries in that column.  This column tells us which set to choose for the cover of X.

Difficulty: 5.  I’m pretty embarrassed I didn’t come up with the idea of using elements as vectors and sets as boolean entries in the vector.  It’s very similar to the graph theory reductions from 3SAT where we have vertices for edges and clauses, and an edge between a literal vertex and a clause vertex if the literal is in that clause.  There’s a similar property there that each clause has degree 3, that you can exploit.

## Subset Product

I’m setting this to post automatically on the 27th.  Hopefully, it posts correctly.

Sp12 is Partition

Sp13 is Subset Sum

This next problem is related to those, but has a cool twist.

The problem: Subset Product.  This is problem SP14 in the appendix.

The description: Given a set A of positive integers, and a positive integer B, is there a subset A’ of A such that the product of the sizes all elements in A’ is exactly B?

(The G&J definition of the problem defines A as a set of generic elements, each with a positive integer “size”.  This is more general in that it allows for two different elements in A to have the same size.  But most of the time this and similar problems (for example: Subset Sum, Partition) are encountered, it is with the definition above)

Example: Let A = {1,2,3,4,5,6} .  If B = 60, then setting A’ to {2,4,5} solves the problem.  If B=61, then no subset of A will multiply to B.  61 is easy to see since it’s prime, but other non-prime numbers (like 35) also will not have a solution.

The reduction: G&J say to use X3C, and I’ll admit that this idea came to me while I was wrestling with the reductions for Comparative Containment and its relatives, with all of their work creating sets based on prime numbers.

We start with an instance of X3C: a set X with 3q elements, and a collection of 3-element subsets of X.

What we’re going to do is assign a prime number to each element in X- so the first 3q prime numbers will be allocated.

Each element in C will be represented by an element whose size is the product of the 3 prime numbers corresponding to the elements in C.   Notice that since each of the elements in X are represented by distinct prime numbers, the only way for two elements in C to generate the same number is if the two elements in C were exactly the same.  (In which case, the duplicate can be safely removed).

Our set A will be the collection of these C numbers, and our integer B will be the product of the first 3q primes.

So, if a cover C’ exists, multiplying all of the elements in C’ together will give us a number that is the product of the first 3q primes, because each element x in X will appear as a factor in some c in C’ exactly once, and thus each x’s prime number assignment will appear exactly once in the product of all of the numbers that represent the sets in C’.

If a subset A’ of A that multiplies to B exists, then the prime factorization of B gets us each of the first 3q prime numbers exactly once.  Each of the elements in A’ corresponds to a set in C, and the prime factorization of that element in A’ will “cover” three elements in X.  The union of all such coverings will cover X entirely.

The only hard part that remains is to decide whether we can actually find the first 3q prime numbers in polynomial time.  The prime number theorem says that the nth prime number is proportional to n log n, and brute-forcing whether a number is prime is O().  Thus, we should be able to find the first 3q prime numbers in polynomial time just by checking each number individually, starting from 2.  Obviously, more efficient methods also exist.

Difficulty: 4.  It’s easy to go off in very different directions (for example, trying to compare this problem to Sum of Subsets by realizing that taking the logarithm of a product gets you a sum). Also the prime number theorem stuff isn’t obvious, and I don’t know how you mention its existence to students without spoiling the entire reduction.

Still, once they have that possibly obscure bit of knowledge, this is a good easy reduction that many students can follow.