Tag Archives: Difficulty 7

Expected Retrieval Cost

Here’s another problem where the G&J definition confused me for a bit.

The problem: Expected Retrieval Cost.  This is problem SR4 in the appendix.

The description: Given a set R of records, each with a probability of being accessed between 0-1 (and the sum of all probabilities = 1), some number m of sectors to place records on, and a positive integer K.  Can we partition R into m disjoint subsets R1..Rm  such that:

  • The “latency” cost of 2 sectors i and j, called d(i,j) is j-i-1  (if i < j) or m-i+j-1 (if i >=j)
  • The probability of a sector, called  p(Ri), is the sum of the probabilities of the records on that sector
  • The sum over all pairs of sectors i and j is p(Ri) * p(Rj) * d(i,j) is K or less

Example: The thing that was the hardest for me to understand was the definition of d.  The way it’s written, the distance between 2 adjacent sectors (for example d(2,3)) is 0.  The distance between a sector and itself (for example d(2,2)) is m-1.  The paper by Cody and Coffman do a better job of explaining the motivation: What we’re looking at is the time (in sectors traversed) for a disk to read sector j after finishing reading sector i.  So If we read sector 2 right before reading sector 3, the disk has no traversal time to go from the end of sector 2 to the beginning of sector 3.  But if we read sector 2 twice in a row, the disk reader (in this model) needs to scan to the end of all of the sectors, then return to the beginning, then scan all the way to the beginning of sector 2 to read again.

So, suppose we have m=2, and 4 records, each with .25 probability.  If we put them all in the same sector, we have d(i,j) = 1 for all pairs of sectors.  Since all pairs of sectors are in (say) R1, then p(R1) = 1, and p(R2) = 0.  So our sum is:

  • p(R1)*p(R1)* d(1,1) = 1*1*1 = 1, plus
  • p(R1) * p(R2) * d(1,2) = 1*0*0 = 0, plus
  • p(R2) * p(R1)* d(2,1) = 0*1*0 = 0, plus
  • p(R2)* p(R2) * d(2,2) = 0*0*1

..for a total of 1.

If we put 2 records in sector 1, and 2 records in sector 2, then p(R1) = p(R2) = .5.  So our sum is:

  • p(R1)*p(R1)* d(1,1) = .5*.5*1 = .25, plus
  • p(R1) * p(R2) * d(1,2) = .5*.5*0 = 0, plus
  • p(R2) * p(R1)* d(2,1) = .5*1.5*0 = 0, plus
  • p(R2)* p(R2) * d(2,2) = .5*.5*1 = .25

..for a total of .5.

The reduction: Hopefully the example using m=2 helps to show why using Partition is a good choice.  So we start with a set S of elements.  We will turn each element S into a value between 0 and 1 reflecting its proportion of the sum of all of the elements.  For example, if S={1,2,3,4,5}, then we would create a set R of values {1/15, 2/15, 3/15, 4/15, 5/15}.  These probabilities will all be between 0 and 1 and will all sum to 1.

We will set m=2, K = 1/2. Notice that d(1,2) = d(2,1) = 0.  So the only d values that will count for our sum is d(1,1) and d(2,2) (which are both 1)  So by our formula we need p(R1) * p(R2) + p(R2) * p(R1) = .5.

Some algebra tells us that this means that p(R1)*p(R2) = ..25, and we know that p(R1) + p(R2) = 1.  Solving that system of equations gets us p(R1) = p(R2) = .5.  Or, we have an Expected Retrieval Cost solution for R exactly when we have  a partition of S.

Difficulty: 4. Cody and Coffman say the details of the above reduction are “routine” after defining k = 1/2.  It is pretty straightforward, but there are some tricky parts to worry about.

I will say though that the definition in G&J, where it’s not clear how distances to adjacent things can be 0, struck me as much harder, and is the reason I dug up the Cody and Coffman paper in the first place.  I’d say that definition makes the problem a 6 or 7.

Dynamic Storage Allocation

Since Bin Packing was a redo, here is the first real problem in the Storage and Retrieval section.

The problem: Dynamic Storage Allocation.  This is problem SR2 in the appendix.

The description: Given a set A of items.  Each item a in A has size s(a), arrival time r(a) and departure time d(a) (all positive integers).  We’re also given a storage size D.  Can we allocate the items to D “slots” of storage such that:

  • Each item is stored in consecutive slots.  So an element a has to be contained in s(a) adjacent locations from 1 to D.
  • No two items overlap the same slot during the time they are in storage. In other words, if two items a and a’ are mapped to the same slot in D, the must not have any overlap between their arrival and departure times.

Example: Here’s a simple set of items:

Item Number Arrival Departure Size
1 1 2 4
2 2 3 4
3 1 3 2

If D=6, we can store these items by using slots 1-4 to hold both items 1 and 2 (notice that they don’t overlap in time, and having one item arrive right as the other departs is ok), and slots 5-6 to hold item 3.

Reduction: The reference to Stockmeyer in G&J is to a private communication.  I tried working out my own reduction from 3-Partition, but couldn’t make it work.  My approach was to make the sizes of the elements in the 3-Parttion instance map to times in this problem, since G&J give the hint that you can make all sizes 1 or 2.  But I couldn’t figure out how to make it work.  I sort of expect there to be 3 possible sizes for a 3-partition problem, instead of 2.

Eventually, I found a paper by Lim that uses regular Partition, using the storage allocation problem as a special case of a problem involving berthing ships.   (The ship problem adds extra complications like each ship needing a specified clearance between it and other ships).  He starts with a set A of elements, and defines T to be the sum of all of the element sizes.  He then creates one item in the storage allocation problem for each element in S.  For a given s(a) in A, the new item has size s(a), arrival time 2, departure time 3 (so exist for just one time duration) .  He also adds 9 new items that have the effect of creating only two sequences of storage slots that can hold the items from s, each of size= T/2. We can place the items in these slots if and only if there is a partition of S.

Difficulty: 7.  I don’t think the idea is too hard to understand, but the 9 sets that are created are hard to come up with (even if you can understand what their purpose is, coming up with the sets that actually get that purpose accomplished is pretty hard).

Protected: Minimum Separating Set

This content is password protected. To view it please enter your password below:

Protected: Minimum Test Set

This content is password protected. To view it please enter your password below:

Protected: Intersection Graph For Segments On A Grid

This content is password protected. To view it please enter your password below:

Protected: Minimizing Dummy Activities in PERT Networks

This content is password protected. To view it please enter your password below:

Protected: Undirected Two-Commodity Integral Flow

This content is password protected. To view it please enter your password below:

Protected: Directed Two-Commodity Integral Flow

This content is password protected. To view it please enter your password below:

Protected: Capacitated Spanning Tree

This content is password protected. To view it please enter your password below:

Protected: Path Distinguishers

This content is password protected. To view it please enter your password below: