# Tag Archives: Difficulty 4

## Capacity Assignment

This problem is from the same “unpublished manuscript” as last week’s.

The problem: Capacity Assignment.  This is problem SR7 in the appendix.

The description: Given a set C of “communication links”, and a set M of positive capacities.  Each pair of a link c and a capacity m also has a cost function g(c,m) and a delay penalty d(c,m) with the following properties:

• If i < j ∈ M, then g(c,i) ≤ g(c,j)
• If i < j ∈ M, then d(c,i) ≥ d(c,j)

We’re also given positive integers K and J.  The problem is: Can we assign a capacity to each link such that the total g cost of all of our assignments is ≤ K and the total d cost of all of our assignments is ≤ J?

Example: There’s a lot to parse in that problem description.  The first thing to notice is that the set of links C doesn’t necessarily have to link anything together (it’s not like it has to apply to an underlying graph).  So we can just give them names:

C={a,b,c,d,e}

Next, there is no reason why the set of capacities has to be assigned as a bijection to C; the set M could be an entirely different size than C:

M={1,2}

The cost function has to have the property that if we assign a 2 to a link, it has to cost at least as much as assigning a 1 to the link:

g(c,1) = 3 for all c

g(c,2) = 4 for all c

The delay function has to have the property that if we assign a higher capacity to a link, the delay can’t be larger than assigning a lower capacity:

d(c,1) = 6 for all c

d(c,2) = 5 for all c

In this case, if we assign the capacity of 1 to all links, we get a total cost of 15 and a total delay of 30.  If we assign the capacity of 2 to all links, we get a total cost of 20 and a total delay of 25.     If we have K = 18, and J = 27, we can achieve that by setting 2 links to have capacity 1 and 3 links to have capacity 2.
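Here’s a quick Python sketch of that arithmetic (the function and variable names are mine, just for illustration):

```python
# The example above: 5 links, capacities {1, 2}, with
# g(c,1)=3, g(c,2)=4, d(c,1)=6, d(c,2)=5 for every link c.

def totals(assignment, g, d):
    """Total cost and total delay of a list of capacity assignments."""
    return sum(g[m] for m in assignment), sum(d[m] for m in assignment)

g = {1: 3, 2: 4}
d = {1: 6, 2: 5}

print(totals([1] * 5, g, d))          # (15, 30): all links at capacity 1
print(totals([2] * 5, g, d))          # (20, 25): all links at capacity 2
print(totals([1, 1, 2, 2, 2], g, d))  # (18, 27): meets K = 18, J = 27
```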

The reduction: The example above is pretty close to how the reduction will work.  We will reduce from Sum of Subsets, so we start with a set S of integers and a target B.   Our set C will have one element for each element in S.  Our set M will be {1,2}.  Assigning a capacity of 1 will imply we don’t want to take this element in S’, and assigning a capacity of 2 will imply that we do.  (This makes more sense if I can use the set {0,1} for M, but the problem description says the elements of M have to be positive)

We will define our g function so that g(c,1) = 1 for all c, and g(c,2) will be s(c)+1 (where s(c) is the size of the element in S that corresponds to c).

Our d function will work similarly:  d(c,1) = s(c)+1 for all c, and d(c,2) = 1 for all c.  These functions both follow the restrictions for how g and d work.

Set K = |S| + B.  Since each cost is either 1 or s(c)+1, the total cost is |S| plus the sizes of the elements assigned capacity 2.  So this is saying that the elements we assign a 2 (whose cost is s(c)+1 instead of 1) must have sizes summing to at most B.

Let T = the sum of the sizes of all of the elements in S.  Then let J = |S| + T – B.  Again, each d value contributes at least 1, and the elements assigned capacity 1 contribute s(c) on top of that.  So this is saying that there need to be enough elements assigned a 2 (so that their delay is 1) that the sizes of the elements left at capacity 1 sum to at most T – B.
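If it helps, here is the construction as a Python sketch (the names are mine; capacity 2 means “take the element into S’”, capacity 1 means “leave it out”):

```python
# Build the Capacity Assignment instance (g, d, K, J) from a
# Sum of Subsets instance (S, B), following the text above.

def build_instance(S, B):
    g = {c: {1: 1, 2: s + 1} for c, s in enumerate(S)}  # g(c,1)=1, g(c,2)=s(c)+1
    d = {c: {1: s + 1, 2: 1} for c, s in enumerate(S)}  # d(c,1)=s(c)+1, d(c,2)=1
    return g, d, len(S) + B, len(S) + sum(S) - B        # ..., K, J

def meets_bounds(S, take, B):
    """Check both bounds when we take exactly the elements flagged in 'take'."""
    g, d, K, J = build_instance(S, B)
    caps = [2 if t else 1 for t in take]
    cost = sum(g[c][m] for c, m in enumerate(caps))
    delay = sum(d[c][m] for c, m in enumerate(caps))
    return cost <= K and delay <= J

S, B = [3, 5, 2, 7], 9
print(meets_bounds(S, [False, False, True, True], B))   # True: {2, 7} sums to 9
print(meets_bounds(S, [True, False, False, False], B))  # False: delay bound fails
```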

If S has a SOS solution S’, then assigning a capacity of 2 to all elements in S’ and a 1 to all elements not in S’ gives us a cost value of exactly K, and a delay value of exactly J.

If we have a Capacity Assignment solution, then notice that K+J = 2|S| + T, which is also the sum of all costs and delays no matter what assignment is chosen (g(c,m) + d(c,m) = s(c)+2, for all c, no matter what m we use).  So if the sum of the costs were strictly less than K, the sum of the delays would have to be strictly more than J, and vice versa.  The only way to satisfy both the K and J constraints is to make both sums exactly equal, which gives us a SOS solution.

Difficulty: 4.  I think the algebra for this problem is a little easier than last week’s, but it does take some work to understand what the problem is asking.  Changing the problem slightly to allow assignments and costs and delays to be 0 instead of making them all be positive integers makes the reduction easier too.

## Expected Retrieval Cost

Here’s another problem where the G&J definition confused me for a bit.

The problem: Expected Retrieval Cost.  This is problem SR4 in the appendix.

The description: Given a set R of records, each with a probability of being accessed between 0-1 (and the sum of all probabilities = 1), some number m of sectors to place records on, and a positive integer K.  Can we partition R into m disjoint subsets R1..Rm  such that:

• The “latency” cost of 2 sectors i and j, called d(i,j), is j-i-1 (if i < j) or m-i+j-1 (if i ≥ j)
• The probability of a sector, called p(Ri), is the sum of the probabilities of the records on that sector
• The sum, over all pairs of sectors i and j, of p(Ri) * p(Rj) * d(i,j) is K or less

Example: The thing that was the hardest for me to understand was the definition of d.  The way it’s written, the distance between 2 adjacent sectors (for example d(2,3)) is 0.  The distance between a sector and itself (for example d(2,2)) is m-1.  The paper by Cody and Coffman does a better job of explaining the motivation: what we’re looking at is the time (in sectors traversed) for a disk to read sector j after finishing reading sector i.  So if we read sector 2 right before reading sector 3, the disk has no traversal time to go from the end of sector 2 to the beginning of sector 3.  But if we read sector 2 twice in a row, the disk reader (in this model) needs to scan to the end of all of the sectors, then return to the beginning, then scan all the way to the beginning of sector 2 to read it again.
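A tiny Python sketch of that latency function, just to check the corner cases:

```python
# d(i, j): sectors traversed between finishing sector i and starting
# sector j, with sectors numbered 1..m, as defined above.

def d(i, j, m):
    return j - i - 1 if i < j else m - i + j - 1

m = 4
print(d(2, 3, m))  # 0: sector 3 starts right where sector 2 ends
print(d(2, 2, m))  # 3 (= m-1): a full revolution back to sector 2
print(d(3, 1, m))  # 1: traverse sector 4, then wrap around to sector 1
```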

So, suppose we have m=2, and 4 records, each with .25 probability.  If we put all of the records in the same sector (say R1), then p(R1) = 1, and p(R2) = 0.  So our sum is:

• p(R1)*p(R1)* d(1,1) = 1*1*1 = 1, plus
• p(R1) * p(R2) * d(1,2) = 1*0*0 = 0, plus
• p(R2) * p(R1)* d(2,1) = 0*1*0 = 0, plus
• p(R2)* p(R2) * d(2,2) = 0*0*1 = 0

..for a total of 1.

If we put 2 records in sector 1, and 2 records in sector 2, then p(R1) = p(R2) = .5.  So our sum is:

• p(R1)*p(R1)* d(1,1) = .5*.5*1 = .25, plus
• p(R1) * p(R2) * d(1,2) = .5*.5*0 = 0, plus
• p(R2) * p(R1)* d(2,1) = .5*.5*0 = 0, plus
• p(R2)* p(R2) * d(2,2) = .5*.5*1 = .25

..for a total of .5.
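Both computations can be checked with a short Python sketch (the helper names are mine):

```python
# Expected retrieval cost: sum of p(Ri) * p(Rj) * d(i, j) over all
# ordered pairs of sectors, here with m = 2 sectors.

def d(i, j, m):
    return j - i - 1 if i < j else m - i + j - 1

def expected_cost(sector_probs):
    m = len(sector_probs)
    return sum(sector_probs[i - 1] * sector_probs[j - 1] * d(i, j, m)
               for i in range(1, m + 1) for j in range(1, m + 1))

print(expected_cost([1.0, 0.0]))  # 1.0: all four records on sector 1
print(expected_cost([0.5, 0.5]))  # 0.5: two records on each sector
```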

The reduction: Hopefully the example using m=2 helps to show why using Partition is a good choice.  So we start with a set S of elements.  We will turn each element of S into a value between 0 and 1 reflecting its proportion of the sum of all of the elements.  For example, if S={1,2,3,4,5}, then we would create a set R of values {1/15, 2/15, 3/15, 4/15, 5/15}.  These probabilities will all be between 0 and 1 and will all sum to 1.

We will set m=2, K = 1/2.  Notice that d(1,2) = d(2,1) = 0.  So the only d values that will count for our sum are d(1,1) and d(2,2) (which are both 1).  So by our formula we need p(R1)*p(R1) + p(R2)*p(R2) ≤ .5.

Some algebra tells us what this means: since p(R1) + p(R2) = 1, we have p(R1)*p(R1) + p(R2)*p(R2) = 1 – 2*p(R1)*p(R2), so the constraint says p(R1)*p(R2) ≥ .25.  But the product of two numbers that sum to 1 is at most .25, so in fact p(R1)*p(R2) = .25.  Solving that system of equations gets us p(R1) = p(R2) = .5.  Or, we have an Expected Retrieval Cost solution for R exactly when we have a partition of S.
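Here is the reduction as a Python sketch, using exact fractions so the .25 threshold is checked exactly (the names are mine):

```python
# Partition -> Expected Retrieval Cost: normalize S to probabilities,
# use m = 2 sectors and K = 1/2.  Only d(1,1) = d(2,2) = 1 contribute,
# so the cost of a split is p(R1)^2 + p(R2)^2.

from fractions import Fraction

def cost_of_split(S, in_R1):
    total = sum(S)
    p1 = sum(Fraction(s, total) for s, flag in zip(S, in_R1) if flag)
    p2 = 1 - p1
    return p1 * p1 + p2 * p2

S = [1, 2, 3, 4]
print(cost_of_split(S, [True, False, False, True]))   # 1/2: {1,4} vs {2,3} is a partition
print(cost_of_split(S, [True, False, False, False]))  # 41/50: an unbalanced split exceeds 1/2
```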

Difficulty: 4. Cody and Coffman say the details of the above reduction are “routine” after defining K = 1/2.  It is pretty straightforward, but there are some tricky parts to worry about.

I will say though that the definition in G&J, where it’s not clear how distances to adjacent things can be 0, struck me as much harder, and is the reason I dug up the Cody and Coffman paper in the first place.  I’d say that definition makes the problem a 6 or 7.

## Ratio Clique

Last week it was pointed out to me that my reduction for Balanced Complete Bipartite Subgraph was wrong, and in my searches to fix it, I found that the real reduction (by Johnson) used a variant of Clique that said (without proof) that Clique is NP-Complete even if K is fixed to be |V|/2.  I looked up the Clique problem in G&J, and they say in the comments that it is NP-Complete for K equal to any fixed ratio of |V|.

I thought this was a neat easy problem that fit in the 3-6 difficulty range I mentioned last week and decided it was worth a post.  But thinking about this brings up some subtle issues relating to ratios and constants that are common sources of errors among students.  I’ll talk about that at the end.

The problem: I don’t know if there is an official name, so I’m calling it “Ratio Clique”.  It is mentioned in the comments to GT19 (Clique).

The description: For any fixed number r, 0< r < 1, does G have a clique of size r*|V| or more?

Example:  Here’s a graph we’ve used for a previous problem:

If r = .5, then r*|V| = 3.5.  So we’re asking if a clique of 3.5 or more vertices exists (which really means a clique of 4 or more vertices).  It does not exist in this graph.  If r ≤ 3/7, then we would be looking for a clique of size 3, which does exist in this graph (vertices b, c, and t)

The reduction: We will be reducing from the regular Clique problem.  Since we want to show this “for any fixed value of r”, we can’t change r inside our reduction.

So we’re given a graph G=(V, E) and a K as our instance of Clique. We need to build a graph G’=(V’, E’) that has a fixed K’ = ⌈r*|V’|⌉.

G’ will start with G, and will add new vertices to the graph.  The vertices we add depend on the ratio s of K to |V|    (K = ⌈s*|V|⌉).  K’ is initially K, but may change as vertices are added to the graph.

If r > s, then we need to add vertices to V’ that will connect to each other vertex in V’, and each will increase K’ by 1.  This increases the ratio of K’ to |V’|, and we keep adding vertices until that ratio is at least r.

If G has a clique of size K, then the extra vertices in K’ can be added to the clique to form a larger clique (since these new vertices connect to every other vertex)

If G’ has a clique of size K’, notice that it must contain at least K vertices that were initially in G. (We only added K’-K new vertices).  These vertices that exist in G are all connected to each other and so will form a clique in G.

If r < s, then we will add vertices to V’ that are isolated (have no edges connecting to them).  K’ will stay equal to K.  Each vertex we add will reduce the ratio of K’ to |V’|, and we keep adding vertices until K=⌈r*|V’|⌉.

Since these new vertices can not be part of any clique in G’, any clique in G’ must consist only of vertices from G.  Since K=K’, this gives us a clique of size K in both graphs.
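Here is the whole padding construction as a Python sketch (vertex labels and names are mine; vertices are 0..n-1 and edges are pairs (u, v)):

```python
# Pad (G, K) into (G', K') with K' = ceil(r * |V'|): add universal
# vertices (raising K' by 1 each) while the target ratio sits above
# K'/|V'|, or isolated vertices (leaving K' alone) while it sits below.

from math import ceil

def pad_graph(n, edges, K, r):
    edges = set(edges)
    Kp = K
    while Kp != ceil(r * n):
        if ceil(r * n) > Kp:
            # New vertex n connects to every existing vertex, so any
            # K'-clique extends to a (K'+1)-clique.
            edges |= {(u, n) for u in range(n)}
            Kp += 1
        # Otherwise the new vertex n is isolated: no edges, K' unchanged.
        n += 1
    return n, edges, Kp

print(pad_graph(7, set(), 3, 0.5)[::2])  # (8, 4): one universal vertex added
print(pad_graph(4, set(), 3, 0.5)[::2])  # (5, 3): one isolated vertex added
```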

It is probably also worth mentioning just how many vertices need to get added to the graph in each case, to make sure that we are adding a polynomial number.  If r>s, we will be adding w vertices to satisfy the equation: ⌈s*|V|⌉ + w = ⌈r*(|V|+w)⌉

(These are both ways of expressing K’)

Dropping the ceiling functions (since they only lead to a difference of at most one vertex) and solving for w gets us w = (s*|V| – r*|V|) / (r – 1).  Since r > s and r < 1, both the numerator and denominator of that division are negative, so w ends up being positive, and polynomial in |V|.

If r < s, we will be adding w vertices to satisfy the equation:

⌈s*|V|⌉ = ⌈r(|V|+w)⌉

(These are both ways of expressing K)

This can similarly be solved to get w = (s*|V| – r*|V|) / r.  Since s > r, this is also a positive (and polynomial) number of new vertices.

A possible source of mistakes: I’m pretty sure this reduction works, but we need to be careful that there is a difference between “for any fixed ratio r of |V|” and “for any fixed K”.  Because for a fixed K (say, K=7), the question “Does this graph have a 7-Clique?” can be solved in polynomial time (by enumerating all subgraphs of size 7, for example; there are (|V| choose 7) subgraphs, which is O(|V|^7)).  By choosing a ratio instead of a constant K, we gain the ability to scale the size of K’ along with the size of the graph and avoid this issue.  But it is worth mentioning this to students as a possible pitfall.  It’s very easy to do things in a way that effectively treats r*|V| as a constant K, which won’t work.

Difficulty: 3, but if you’re going to make students do the algebra to show the number of vertices that are added, bump it up to a 4.

## Monotone 3-Satisfiability

I told Daniel when he gave me his Monotone Satisfiability reduction that the actual problem mentioned in G&J was Monotone 3-Satisfiability.  So he went off and did that reduction too.
The Problem:
Monotone 3-SAT.  This is a more restrictive case of Monotone SAT.

The Description:
Given a formula F of clauses, where each clause in F contains either all negated or all non-negated variables, and each clause contains at most 3 variables.  Does there exist an assignment of the variables so that F is satisfied?

Example:

The formula (x1 ∨ x2) ∧ (¬x1 ∨ ¬x2) is in Monotone 3SAT form, and the following assignment satisfies it: x1 = true, x2 = false.

However:

(x1 ∨ x2) ∧ (¬x1 ∨ x2) ∧ (x1 ∨ ¬x2) ∧ (¬x1 ∨ ¬x2)

(which is not in Monotone 3SAT form, because of its mixed-parity clauses)

And the following, which is in Monotone 3SAT form:

(x1) ∧ (x2) ∧ (¬x1 ∨ ¬x2)

are both unsatisfiable.

The reduction:
In the following reduction we are given an instance of 3SAT, F = C1 ∧ C2 ∧ … ∧ Ck.  Here each clause is of the form:
Ci = (z_i1 ∨ z_i2 ∨ z_i3)
where each z_ij is a literal of the form x or ¬x for some variable x.
We use the following construction to build an instance of Monotone 3SAT out of the above instance of 3SAT:
In each clause we have at most one literal, z_ij, that is not of the same parity as the rest of the literals in the clause.  For every such literal, with underlying variable x, we may perform the following substitution: replace z_ij with the opposite-parity literal of a brand-new variable q_ij (so x becomes ¬q_ij, and ¬x becomes q_ij); this yields a modified clause C'i whose literals all have the same parity.
Now we must be able to guarantee that x and q_ij are mapped to opposite truth values, so we introduce the new clauses:
(x ∨ q_ij) ∧ (¬x ∨ ¬q_ij)
and conjunct them onto our old formula, producing a new formula F'.

For example:
C1 = (x1 ∨ x2 ∨ ¬x3) has one literal (¬x3) whose parity differs from the rest, so we perform the substitution ¬x3 → q1:
C'1 = (x1 ∨ x2 ∨ q1), and we add the clauses (x3 ∨ q1) ∧ (¬x3 ∨ ¬q1) to force q1 and x3 to opposite values.

Now repeating this procedure for every clause of mixed parity will result in a new formula F', which is in Monotone 3SAT form.
We claim that F and F' are equisatisfiable.  This is semantically intuitive, as the clauses (x ∨ q_ij) ∧ (¬x ∨ ¬q_ij) require each substituted variable q_ij to take the value opposite of its original variable x; this was the stipulation for the substitution initially.  It is also verifiable by truth table construction.

F ⇒ F':
If there exists a truth assignment φ that satisfies F, then we may extend this truth assignment to produce an assignment φ' that satisfies F', by letting φ'(x) = φ(x) for all original variables x, and letting φ'(q_ij) = ¬φ(x) for each new variable q_ij substituted for a literal of x.  Every new clause (x ∨ q_ij) ∧ (¬x ∨ ¬q_ij) is obviously satisfied by this construction of φ', and each modified clause C'i is satisfied exactly when Ci was.  So φ' will satisfy F'.

F' ⇒ F:
Continuing from the above, if we have a truth assignment φ' that satisfies F', then the clauses (x ∨ q_ij) ∧ (¬x ∨ ¬q_ij) force φ'(q_ij) = ¬φ'(x), so each substituted literal takes the same truth value as the literal it replaced.  So any truth assignment that satisfies F' must also satisfy every original clause Ci, and therefore F.
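The substitution can be sketched in Python, with a brute-force check of equisatisfiability on small formulas.  Literals here are DIMACS-style nonzero integers (v for a variable, -v for its negation), and all of the names are mine:

```python
# For each mixed clause, swap the minority-parity literal for the
# opposite-parity literal of a fresh variable q, and add the monotone
# clauses (x or q) and (not-x or not-q), which force q == not x.

from itertools import product

def to_monotone(clauses, num_vars):
    out = []
    for clause in clauses:
        pos = [l for l in clause if l > 0]
        neg = [l for l in clause if l < 0]
        if pos and neg:
            minority = pos[0] if len(pos) <= len(neg) else neg[0]
            num_vars += 1
            q, x = num_vars, abs(minority)
            sub = -q if minority > 0 else q      # same truth value as 'minority'
            out.append([sub if l == minority else l for l in clause])
            out += [[x, q], [-x, -q]]            # enforce q == not x
        else:
            out.append(clause)                   # already monotone
    return out, num_vars

def satisfiable(clauses, num_vars):
    return any(all(any((l > 0) == tv[abs(l) - 1] for l in c) for c in clauses)
               for tv in product([False, True], repeat=num_vars))

F = [[1, 2, -3], [-1, -2, 3], [1, -2, -3]]       # a mixed 3-CNF formula
G, n = to_monotone(F, 3)
print(satisfiable(F, 3), satisfiable(G, n))      # True True: equisatisfiable
```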

(Back to me)

Difficulty: 4, since it’s a little harder than the regular Monotone SAT one.