Tag Archives: Difficulty 7

Resource Constrained Scheduling

I spent a bit of time trying to see if there was an easy reduction between this problem and one of the other scheduling ones, but I couldn’t find one.  So we’ll do it the hard way.

The problem: Given a set T of tasks, each with a length of 1, a set of m processors, and a set of r resources.  Each resource i has a “bound” bi of how many units of that resource are available at any time, and each task t has a requirement ri(t) denoting how many units of resource i it needs.  We’re also given a deadline D.

Can we find an m-processor schedule for all tasks that:

1) Meets the deadline, and

2) For each time unit, ensures that the tasks running in that time unit do not together exceed the bound of any resource?

Example: Suppose I have 4 tasks and 1 resource:

  • Task 1 uses 2 units of the resource
  • Task 2 uses 3 units of the resource
  • Task 3 uses 4 units of the resource
  • Task 4 uses 5 units of the resource

If there are 5 units of the resource available, we can get these tasks done in 3 time steps (Task 4, then Tasks 1 and 2, then Task 3).

If there are 7 units available, we can get the tasks done in 2 time steps (Tasks 1 and 4, then Tasks 2 and 3).

If there are 14 units available, we can do everything in one time step.
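To make the constraints concrete, here’s a minimal feasibility checker for this example.  It’s a sketch with names of my own (req, bounds, and so on, which aren’t from the problem statement), and the assumption of 2 processors for these schedules is mine:

```python
def is_feasible(schedule, req, bounds, m, deadline):
    """schedule: one list of tasks per time step.
       req[t][i]: units of resource i that task t needs.
       bounds[i]: units of resource i available in any one time step."""
    if len(schedule) > deadline:
        return False                      # misses the deadline
    for step in schedule:
        if len(step) > m:
            return False                  # more tasks than processors
        for i, bound in enumerate(bounds):
            if sum(req[t][i] for t in step) > bound:
                return False              # resource i oversubscribed
    return True

# The 4-task, 1-resource example with 5 units available:
req = {1: [2], 2: [3], 3: [4], 4: [5]}
print(is_feasible([[4], [1, 2], [3]], req, bounds=[5], m=2, deadline=3))  # True
print(is_feasible([[1, 4], [2, 3]], req, bounds=[5], m=2, deadline=3))    # False: 2+5 > 5
```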

Reduction: The paper by Garey and Johnson proves several variants of the problem.  The one I’ll do is in Lemma 3.4, which uses 5 processors and 8 resources.  They go on from there to show that the variant with 3 processors and 1 resource is also NP-Complete, but we only care about the general problem.

The reduction is from 3DM.  In this case, we’re assuming that W, X, and Y (the 3 sets in the 3DM problem) are all the same set T, and S is our set of triples from T.  Define mk(i) to be the number of triples in S that have i in their kth position.  So m2(4) is the number of triples with a 4 in position 2.  Notice that for a legal matching to be possible, mk(i) always has to be at least 1.

So we build an instance of the resource constrained scheduling problem with 5 processors and 8 types of resources (there’s a code sketch of the task construction after the list).   The tasks are:

  • One “S” task for each triple in S.
  • X0, Y0, and Z0 tasks for each of the q elements of T (so q = |T|).
  • Xij tasks, where i runs from 1 to q and j runs from 1 to m1(i)-1 (one less than m1(i), so that counting the X0 tasks, there are exactly |S| X-type tasks).
  • Similar Yij (using m2(i)) and Zij (using m3(i)) tasks.
  • “F” tasks from 1 to q.
  • |S|-q “G” tasks.

The deadline is |S|.
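Here’s the sketch promised above: code that just enumerates the task set from a 3DM instance, with names of my own.  The resource requirements that actually force the schedule’s structure are omitted (they’re in Lemma 3.4 of the paper); this only shows that the counts work out to exactly 5 tasks per time step.

```python
from collections import Counter

def build_tasks(S, q):
    """S: a list of triples over the elements 1..q."""
    m = [Counter(triple[k] for triple in S) for k in range(3)]
    tasks = [f"S{t}" for t in S]                          # one task per triple
    for name, k in (("X", 0), ("Y", 1), ("Z", 2)):
        for i in range(1, q + 1):
            tasks.append(f"{name}0({i})")                 # the X0/Y0/Z0 task
            # m_k(i)-1 chain tasks, so each column totals exactly |S|
            tasks += [f"{name}({i},{j})" for j in range(1, m[k][i])]
    tasks += [f"F{i}" for i in range(1, q + 1)]           # q F tasks
    tasks += [f"G{i}" for i in range(1, len(S) - q + 1)]  # |S|-q G tasks
    return tasks, len(S)                                  # task list and deadline

S = [(1, 1, 1), (1, 2, 2), (2, 1, 2), (2, 2, 1)]
tasks, deadline = build_tasks(S, q=2)
print(len(tasks), deadline)   # 20 4 -- that is, 5 tasks in each of 4 time steps
```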

The resource requirements are set up so that during each time unit, we have 5 processors running:

  1. An S task
  2. An X or X0
  3. A Y or Y0
  4. A Z or Z0
  5. An F or G

The X0/Y0/Z0 tasks each correspond to an element of T, and can only be run alongside an F task and the S task for the triple of S made up of those X, Y, and Z elements.   The F tasks make sure that we have chosen a legal matching.   The matching consists of the S triples that are scheduled alongside an F task.  (The G tasks are there to fill out the remaining time steps.)

Difficulty: 7.  More if you keep going and refine the problem down to 3 processors and 1 resource.

Single Execution Time Scheduling With Variable Number of Processors

Sorry for skipping last week.  I got about halfway through the post for the next problem, but forgot that the reduction went through this other problem instead, and then ran out of time to fix it.

The problem: Single Execution Time Scheduling.  This problem is not in the appendix and is problem “P4” in the paper by Ullman that has this reduction and the problem next week.

The description: Given a set S of N jobs arranged in a partial order, each taking 1 time unit, a deadline D, and a sequence of D “processor slot” counts, one for each time step from 0 to D-1, whose sum is exactly N, can we create a feasible schedule for all jobs that respects the partial order?

Example: Here’s a simple example of a partial order: job a comes before jobs b and c, and both b and c come before job d.

If D=3 and the processor slot sequence was {1,2,1}, then this can be solved: schedule a at time 0, b and c at time 1, and d at time 2.   (So d is done by time 3).

If the processor slot sequence was {1,1,2}, then at time 1 we can only schedule one of b and c, so we won’t be able to schedule d at time 2.
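Deciding feasibility is the hard part (that’s what the reduction below shows), so here’s just a small validator, with names of my own: given a proposed assignment of jobs to time steps, it checks the slot counts and the partial order.

```python
def valid_schedule(assign, slots, prec):
    """assign: job -> time step; slots[t]: processors available at step t;
       prec: (a, b) pairs meaning a must finish before b starts."""
    if any(step >= len(slots) for step in assign.values()):
        return False                      # some job runs past the deadline
    for t, capacity in enumerate(slots):
        if sum(1 for step in assign.values() if step == t) > capacity:
            return False                  # too many jobs in one time step
    return all(assign[a] < assign[b] for a, b in prec)

prec = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(valid_schedule({"a": 0, "b": 1, "c": 1, "d": 2}, [1, 2, 1], prec))  # True
print(valid_schedule({"a": 0, "b": 1, "c": 2, "d": 2}, [1, 1, 2], prec))  # False: c isn't done before d
```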

Reduction: The Ullman paper goes from 3SAT.  The original formula will have m variables and n clauses.  We will create jobs:

  • xij and ~xij where i goes from 1 to m and j goes from 0 to m.
  • yi and ~yi where i goes from 1 to m
  • Dij where i goes from 1 to n and j goes from 1 to 7.

The partial order on these jobs is:

  • Each xij comes before xi,j+1 and each ~xij comes before ~xi,j+1
  • Each xi,i-1 comes before yi and each ~xi,i-1 comes before ~yi
  • For each Dij, represent the j index as a binary number (from 1 to 7).  The three literals in clause i have 7 combinations of truth values that make the clause true, and those combinations can also be numbered 1 to 7.  For the combination corresponding to j, look at the variable settings (positive or negative) that produce it, and make the last job in each corresponding chain come before the Dij job.  So if the combination sets variable k positively, we’d make xkm come before the Dij job we’re looking at (and ~xkm if it sets it negatively).

That last rule is just a way of saying “there are 7 ways of making the clause true; you need to execute the jobs of the literal settings that make the clause true before you can do the clause’s job.”
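Here’s how I read that rule as code.  The bit convention (bit k of j giving the truth value of literal k) is my own assumption about how the 7 combinations are indexed:

```python
def d_job_predecessors(clause, j, m):
    """clause: three (variable, is_positive) pairs; j: 1..7, one satisfying
       combination; m: number of variables (so x(v,m) ends v's chain)."""
    preds = []
    for k, (var, is_positive) in enumerate(clause):
        literal_true = bool((j >> k) & 1)         # bit k: is literal k true?
        var_true = (literal_true == is_positive)  # variable setting giving that value
        chain = "x" if var_true else "~x"
        preds.append(f"{chain}({var},{m})")       # last job of that chain
    return preds

# Clause (x1 or ~x2 or x3), combination j=5: literals 1 and 3 true, 2 false,
# which means variables 1, 2, and 3 are all set to true.
print(d_job_predecessors([(1, True), (2, False), (3, True)], 5, m=3))
# ['x(1,3)', 'x(2,3)', 'x(3,3)']
```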

The deadline is m+3 (time steps 0 through m+2).  The processor slots have:

  • m slots at time 0
  • 2m+1 slots at time 1
  • 2m+2 slots at each of times 2 through m
  • n+m+1 slots at time m+1
  • 6n slots at time m+2

The idea is that at time 0 we need to run one of either xi0 or ~xi0 for each i (the other is run at time 1).  These choices correspond to whether variable i is set to true or false.  We need to do that because we need to run the y jobs as soon as they become available (y1 or ~y1 – whichever has the same polarity as the x1 job we ran at time 0 – needs to be run at time 1, and so on down).  At time 1, we run either xi1 or ~xi1, depending on what we did at time 0.  So at time m+1, we have one y job left over (the last of the y’s in the chains we started late), m x jobs left over (the xim or ~xim jobs of the chains we started at time 1), and hopefully enough x jobs finished to be able to run n D jobs (one for each clause).  This is how you satisfy each clause.  Then at time m+2, everything is done except for the other 6n D jobs.
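As a sanity check (mine, not Ullman’s): the slot counts add up to exactly the number of jobs, which is why every slot must be filled and the schedule is so tightly constrained.

```python
def total_slots(m, n):
    # m at time 0, 2m+1 at time 1, 2m+2 at each of times 2..m,
    # n+m+1 at time m+1, and 6n at time m+2
    return m + (2*m + 1) + (2*m + 2)*(m - 1) + (n + m + 1) + 6*n

def total_jobs(m, n):
    return 2*m*(m + 1) + 2*m + 7*n   # x/~x chains, y/~y jobs, D jobs

for m, n in [(3, 2), (5, 4), (10, 17)]:
    assert total_slots(m, n) == total_jobs(m, n)   # both equal 2m^2 + 4m + 7n
print("slot counts match job counts")
```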

Difficulty: 7.  I think Ullman does a very good job of explaining his method, which actually obscures a bit how complex this reduction is, and all of the moving parts and algebra going on.


Sequencing With Deadlines and Setup Times

Is it weird that G&J call these problems “sequencing” and not “scheduling”?  They use “scheduling” for multiprocessor problems, and shop scheduling problems.  I guess the thinking is that single-processor problems are just “sequences” of what you put on the processor.

The problem: Sequencing with Deadlines and Setup Times.  This is problem SS6 in the appendix.

The description: We’re given a set T of tasks, each task t with a length l(t) and a deadline d(t).  We also have a set C of “compilers”, and each task has a specific compiler k(t).  Each compiler c has a setup time l(c).  Can we find a one-processor schedule for T that meets the deadlines of all tasks, with the additional constraint that whenever two consecutively scheduled tasks use different compilers, we must pay the setup cost of the second task’s compiler before starting the second task?

Example: I’m not sure “compiler” is the best word for C.  I think of it as more of a preprocessor (and in fact, the paper by Bruno and Downey that has the reduction calls it a “setup task”).  Each task needs some preprocessing to be done before it can run, and if you run two tasks that need the same preprocessing in a row, you can do them both one after the other. Otherwise, you need the preprocessing to happen (immediately) before you start the task.

So, suppose I have 2 compilers:

Compiler  Setup time
1         5
2         7

And 4 tasks:

Task  Length  Deadline  Compiler
1     5       10        1
2     3       20        2
3     3       23        2
4     10      100       1

Then if we schedule task 1 first, its setup plus its length gets it done at time 10.  Then we do the setup for task 2, so it finishes at time 20.  Task 3 uses the same “compiler”, so it does not need setup time and finishes at time 23.  Task 4 uses a different compiler, so we need to re-run compiler 1’s setup time of 5, and it will be done at time 38.

Notice that the most “efficient” schedule that minimizes the number of setups we have to do will not schedule all tasks by their deadlines.
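Here’s a small simulator of the example (the code and names are mine): it walks a proposed order, pays a compiler’s setup time whenever the compiler changes, and checks each task’s deadline.  The second call tries the setup-minimizing order, which fails as described above.

```python
def meets_deadlines(order, length, deadline, compiler, setup):
    t, current, ok = 0, None, True
    for task in order:
        if compiler[task] != current:       # compiler change: pay its setup
            current = compiler[task]
            t += setup[current]
        t += length[task]
        print(f"task {task} finishes at {t} (deadline {deadline[task]})")
        ok = ok and t <= deadline[task]
    return ok

length   = {1: 5, 2: 3, 3: 3, 4: 10}
deadline = {1: 10, 2: 20, 3: 23, 4: 100}
compiler = {1: 1, 2: 2, 3: 2, 4: 1}
setup    = {1: 5, 2: 7}

print(meets_deadlines([1, 2, 3, 4], length, deadline, compiler, setup))  # True
print(meets_deadlines([1, 4, 2, 3], length, deadline, compiler, setup))  # False: fewer setups, but task 2 misses its deadline
```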

Reduction: I’m pretty sure this problem is Theorem 1 in the paper by Bruno and Downey.  They say they reduce from Knapsack, but since there aren’t any weights, they’re really using Sum of Subsets.  The original set S has N elements, a target B, and the sum of all elements in the set is called t0 (which will, of course, be > B).  There will be N+2 “classes” of 3 tasks each that share the same setup task.  The setup for each class Ci takes time 1, and the class has 3 tasks: one that takes time 1, one that takes time si (the corresponding element of S), and a third that takes time H*si, where H is the larger of t0 and N+2.  The 2 “extra” classes C0 and CN+1 work similarly, but use t0 instead of si.  The deadline for the first task in every class is d1 = 2(N+2)+B.  The deadline for the second task in every class is d2 = 3N+5+3t0+2Ht0-HB.  The deadline for the third task in every class is d3 = 3N+5+3t0+3Ht0.
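Here’s a sketch that writes out those classes and deadlines from a Sum of Subsets instance, mirroring the formulas above (the code and names are mine):

```python
def build_instance(S, B):
    N, t0 = len(S), sum(S)
    H = max(t0, N + 2)
    d1 = 2 * (N + 2) + B
    d2 = 3*N + 5 + 3*t0 + 2*H*t0 - H*B
    d3 = 3*N + 5 + 3*t0 + 3*H*t0
    classes = {}
    for i, s in enumerate(S, start=1):       # classes C1..CN
        classes[f"C{i}"] = [1, s, H * s]     # the 3 task lengths; setup time is 1
    for name in ("C0", f"C{N + 1}"):         # the two "extra" classes
        classes[name] = [1, t0, H * t0]
    return classes, (d1, d2, d3)

classes, (d1, d2, d3) = build_instance(S=[2, 3, 5], B=5)
print(d1, d2, d3)   # 15 194 344
```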

They go on to show the following facts:

  • In a feasible schedule, you can’t have a third task (one with deadline d3) finish before the d1 deadline (there is no time for it).
  • In a feasible schedule, you can’t finish the second task of C0 or CN+1 before d1.  (Since the second task takes t0 time, feasibly scheduling everyone’s first tasks with their setup times does not leave enough time.)
  • In a feasible schedule, we can run at most 2N+3 setups.  (More would spend too much time on setup and processing, and someone would miss their d3 deadline.)
  • In a feasible schedule, there is exactly one class that is set up only once (that is, it gets set up and then runs its three tasks in sequence).  This follows from the first bullet and the limit on the number of setups.
  • In a feasible schedule, you never do the first task and then the third task without doing the second task in between.  (You can rearrange other feasible schedules to have this property and remain feasible.)

What results from these facts is that a feasible schedule has: a bunch of first tasks (each preceded by its setup task), a bunch of first-and-second task pairs (each preceded by its setup task), and a single class that does its setup and then all three of its tasks in sequence.  That last sequence crosses over the d1 deadline.  Then we need to set up and schedule the second and third tasks of all of the classes we only did the first task for, before the d2 deadline.  Then we set up and schedule the last task of every class that still has a third task remaining.

It turns out that the classes whose first two tasks we chose to do before the d1 deadline have second-task lengths (si values) summing to exactly B, which gives us our Sum of Subsets solution.

Difficulty: 7.  This is a really good paper that does a really good job of proving each of the above facts along the way and showing how it all works (many other papers would resort to saying some of these proofs were “obvious”, which I don’t think they are).  Having said that, I don’t know how anyone comes up with these classes (especially the deadlines) without weeks of trial and error, and that is too much for a student to manage on their own.

Serializability of Database Histories

This is another problem with a cool elegant reduction once you get past the baggage you need to know to understand the problem.  This database section seems to be full of problems like these.

The problem: Serializability of Database Histories.  This is problem SR33 in the appendix.

The description: We have a set V of variables in our database and a set T of transactions, where each transaction i has a read operation (Ri) that reads some subset of V and a write operation (Wi) that writes some (possibly different) subset of V.  We’re also given a “history” H of T, which is an ordering of all of the reads and writes maintaining the property that, for all i, Ri comes before Wi in the history.  Think of the transactions as arriving in parallel at a central database; H is the order in which the database processes their operations.

Can we find a serial history H’ of T, with the following properties:

  • Each Ri occurs immediately before its corresponding Wi.
  • A “live transaction” is a transaction (Ri, Wi) where Wi is either the last write of some variable before the Rj of another live transaction, or the last write of that variable in the whole history.  The sets of live transactions in H and H’ need to be the same.
  • For each pair of live transactions (Ri, Wi) and (Rj, Wj), for any variable v in Wi∩Rj, Wi is the last write set to contain v before Rj in H if and only if Wi is the last write set to contain v before Rj in H’.  The paper says that this means transaction j “reads from” (or “reads v from”) transaction i.

Example: The paper by Papadimitriou, Bernstein, and Rothnie that has the reduction has a good simple example of a non-serializable history:

H = <R1, R2, W2, W1>, where R1 and W2 access a variable x, and R2 and W1 access a variable y.  Both transactions are live, since each writes its variable for the last time.  Notice that in H neither transaction reads a value written by the other.  But the two possible candidates for H’ are <R1, W1, R2, W2> (where R2 reads the y written by W1) and <R2, W2, R1, W1> (where R1 reads the x written by W2), so neither candidate preserves H’s (empty) set of “reads from” relationships.

Reduction: Is from Non-Circular Satisfiability.  Given a formula, they generate a “polygraph” of a database history.  A polygraph (N,A,B) is a directed graph (N,A) along with a set B of “bipaths” (paths that are 2 edges long), with the property that if a bipath {(v,u), (u,w)} is in B, then the edge (w,v) is in A.  So, if a bipath exists in B from v to w, then an edge in A exists from w back to v.  This means that we can view a polygraph (N,A,B) as a family of directed graphs: each directed graph in the family has the same vertices and an edge set A’ that is a superset of A and contains at least one edge from each bipath in B.  They define an acyclic polygraph as a polygraph whose family contains at least one acyclic directed graph.
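To make the “family of directed graphs” idea concrete, here’s a brute-force acyclicity check.  It’s exponential in |B| and purely illustrative, and the toy polygraph at the bottom is my own, not from the paper.

```python
from itertools import product

def is_acyclic_digraph(nodes, edges):
    # Kahn's algorithm: repeatedly strip vertices with no incoming edge.
    remaining, edges = set(nodes), set(edges)
    while remaining:
        sources = {v for v in remaining if not any(e[1] == v for e in edges)}
        if not sources:
            return False                  # everything left lies on a cycle
        remaining -= sources
        edges = {e for e in edges if e[0] in remaining and e[1] in remaining}
    return True

def is_acyclic_polygraph(nodes, A, B):
    # Acyclic iff SOME choice of one edge per bipath gives an acyclic digraph.
    for choice in product(*B):
        if is_acyclic_digraph(nodes, set(A) | set(choice)):
            return True
    return False

# Toy example: the bipath says T3 comes after T2 or before T1; note that the
# bipath {(T2,T3), (T3,T1)} forces the edge (T1,T2) in A, as in the definition.
nodes = ["T1", "T2", "T3"]
A = [("T1", "T2")]
B = [(("T2", "T3"), ("T3", "T1"))]
print(is_acyclic_polygraph(nodes, A, B))   # True
```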

In the paper, they relate database histories to polygraphs by letting the vertex set N be the set of live transactions.  We add an edge (u,v) to A from each transaction u that writes a variable to each transaction v that reads that value of the variable.  If some other live transaction w also writes that variable, then the bipath {(v,w), (w,u)} exists in B.  So an edge (u,v) in A means that u “happens before” v, since v reads the variable u writes.  A bipath {(v,w), (w,u)} means that w writes the same variable, so w must happen either before u or after v (otherwise v would read w’s value instead of u’s).  They show that a history is serializable if and only if the polygraph for the history is acyclic.

So, given a formula, they build a polygraph that is acyclic if and only if the formula is satisfiable.  The polygraph will have 3 vertices (aj, bj, and cj) for each variable xj in the formula.  Each a vertex connects by an edge in A to its corresponding b vertex.  We also have a bipath in B from the b vertex through the corresponding c vertex back to the a vertex.

Each literal Cik (literal #k of clause i) generates two vertices, yik and zik.  We add edges in A from each yik to zi,(k+1) mod 3 (in other words, the y vertex of each literal connects to the z vertex of the next literal in the clause, wrapping around if necessary).  If literal Cik is a positive occurrence of variable xj, we add edges (cj, yik) and (bj, zik) to A, and the bipath {(zik, yik), (yik, bj)} to B.  If the literal is negative, we instead add (zik, cj) to A and {(aj, zik), (zik, yik)} to B.
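Here’s my reading of that construction as code; it returns (nodes, A, B) in the same format as the previous sketch, so the brute-force checker above can be run on small formulas.  The vertex names are mine.

```python
def formula_to_polygraph(num_vars, clauses):
    """clauses: list of clauses, each a list of 3 (variable, is_positive) pairs."""
    nodes, A, B = [], [], []
    for j in range(1, num_vars + 1):
        a, b, c = f"a{j}", f"b{j}", f"c{j}"
        nodes += [a, b, c]
        A.append((a, b))                        # a_j -> b_j in A
        B.append(((b, c), (c, a)))              # bipath b_j -> c_j -> a_j
    for i, clause in enumerate(clauses):
        ys = [f"y({i},{k})" for k in range(3)]
        zs = [f"z({i},{k})" for k in range(3)]
        nodes += ys + zs
        for k, (j, positive) in enumerate(clause):
            A.append((ys[k], zs[(k + 1) % 3]))  # y_ik -> z_i,(k+1 mod 3)
            if positive:
                A += [(f"c{j}", ys[k]), (f"b{j}", zs[k])]
                B.append(((zs[k], ys[k]), (ys[k], f"b{j}")))
            else:
                A.append((zs[k], f"c{j}"))
                B.append(((f"a{j}", zs[k]), (zs[k], ys[k])))
    return nodes, A, B

# A one-variable, one-clause formula (x1 or x1 or x1): satisfiable, so the
# polygraph should come out acyclic if fed to is_acyclic_polygraph above.
nodes, A, B = formula_to_polygraph(1, [[(1, True), (1, True), (1, True)]])
print(len(nodes), len(A), len(B))   # 9 10 4
```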

If the polygraph is acyclic (and thus the history it represents is serializable), then there is some acyclic digraph in the family of directed graphs related to the polygraph.  So for each variable, the bipath {(bj, cj), (cj, aj)} contributes either its first edge, from b to c (which we will read as “false”), or its second edge, from c to a (which we will read as “true”).  (The directed graph can’t have both, because its edges are a superset of A, which means it has the edge (aj, bj), and taking both halves of the bipath would cause a cycle.)

If our acyclic directed graph has the “false” (b-c) version of the edge for a variable, then for each positive literal of that variable it also has to take the z-y edge of the literal’s bipath (otherwise there is a cycle through that variable’s vertices); the same is forced for negative literals when the “true” edge is taken.  If all of the literals in a clause were set to false this way, those z-y bipath edges together with the y-z edges we added in A for the clause would form a cycle.  So at least one literal per clause must be true, which gives us a way to satisfy the formula.

If the formula is satisfiable, then build the digraph that starts with all of A and takes the bipath edge corresponding to the truth value of each variable, as defined above.  That choice forces how you need to take the edges from the literals’ bipaths to avoid cycles.  The only place a cycle could still appear is among the y and z vertices of a clause, but that would imply that we had set all of the literals in that clause to false.  Since we know every clause can be made true (the original formula is satisfiable), a way exists to make the directed graph acyclic.

Difficulty: 7.  It takes a lot of baggage to get to the actual reduction here, but once you do, I think it’s pretty easy and cool to see how the cycles arise from the definitions of the graphs and from the formula.
