Tag Archives: Difficulty 7

Randomization Test For Matched Pairs

This reduction gave me a lot of trouble and gave me a lot of holes to fill.  I think I filled them okay, but I’m getting the vibe that I’m missing something here- the reduction reads very differently from the way the problem is described.

The problem: Randomization Test For Matched Pairs.  This is problem MS10 in the appendix.

The description: Given a series of n pairs of integers (x1, y1) through (xn, yn), and a positive integer K.  Are there at least K subsets of the integers for which:

\sum_{i \in S} |x_i - y_i | \leq \sum_{y_i > x_i} (y_i - x_i)?

Example:  The paper by Shamos that has the reduction might have a better explanation of what we are doing: he defines zi = yi – xi, and T* to be the sum of all of the positive zi (the right side of the inequality above).  Then if we could choose some set of zi to be positive (by changing their sign), and call that sum T, are there K or more sets where T \leq T*?

So, suppose we have 4 pairs: (2,4), (1,6), (8,4), (3,2).  This would make our zi‘s {2,5,-4,-1}, and so T* would be 7.

There are 16 subsets of the above pairs (nothing seems to say that the empty set isn’t allowed).  Here are some whose T is \leq 7:

{(2,4), (1,6)}   (T=7)

{(2,4), (8,4)} (T=6)

{(2,4), (8,4), (3,2)} (T=7)

{(3,2)} (T=1)

..and so on.  A set that does not work is {(1,6), (8,4)} since its T is 9.

Reduction: Shamos uses Partition, so we are given a set of N elements.  The total sum of the elements is S, so we want two subsets of size S/2. He wants to “Perform the randomization test on the numbers” in the partition set, which I don’t get because the test needs to be done on pairs.  He also wants T* to be S/2.  The paper seems to say that you can just set that, but it has to be based on the zi.  So I came up with a way to make that happen:

For each item xi, create a pair  (xi, 0).  The zi for each of these elements will be negative.  Then add one extra pair (0. S/2).  Since this is the only positive z value, T* will be this value.  I don’t know if this is what the paper wants me to do, and I’m a little worried that adding an extra element will throw off what is next.

The paper then claims that if there is no partition of the set elements into equal-weight subsets, there are 2N-1 subsets with a T < T*.  This is because if there is no partition of equal size, then each subset of our elements either has a sum < S/2 (and this a T < T*), or its complement does.

If there is a partition, then 2 sets will have a T value of exactly T* (the two partition subsets) and half of the remaining subsets will have a T < T* as above.  So if we set K (the number of subsets whose T needs to be <= T*) to be 2N-1+2, we will have our reduction.

Difficulty: 7. I spent a long time trying to read and understand this reduction.  It’s very sparse, and, really, doesn’t explain at all how to make the pairs.  As a result, I’m pretty sure I filled in all of the holes, but it’s very possible that I’m missing something.