This problem is related to last week’s but turns out to be much harder.
The problem: Safety of Database Transaction Systems. This is problem SR34 in the appendix.
The description: Given a set of variables V and transactions T as defined in the Serializability of Database Histories problem, is every history H for T equivalent to some serial history?
Example: This is an extension of last week’s problem. So last week’s example which produced a history that is not serializable means that that set of transactions is not safe.
The easiest way to produce transactions that are safe is to make them access different variables: Suppose I have 2 transactions (R1, W1) and (R2, W2). If the first transaction reads and writes a variable x and the second transaction reads and writes a variable y, then any ordering of those transactions will be serializable.
Reduction: It’s in the same paper by Papadimitriou, Bernstein, and Rothnie as the Serializability of Database History problem was. It’s interesting that they couldn’t show that the problem was in NP.
They reduce from Hitting Set. They show in the paper how to take a transaction system and build a graph where there is one vertex for each read and write operation in each transaction, and edges between the two operations in a transaction. There are also edges between operations Ri and Wj or Wi and Wj if those operations share a variable. These edges show places where changing the order changes the meaning of a transaction history. They show that a transaction system is safe if and only if the graph has no cycles containing a (Rj, Wj) edge. (Note that means that cycles can exist as long as they contain only W-W edges)
So, given an instance of hitting set, (a set S and a collection C of subsets of S), they build a transaction graph: One read vertex for each set in S, plus one write vertex at the end. Between read vertices Ri and Ri+1 we add |Ci| edges (or, so we still have a simple graph, |Ci| paths containing vertices that don’t appear anyplace else). At the end of this chain of paths is a single W vertex, with an edge back to R1. The only unsafe cycle now starts at R1, goes through one of the paths connecting each R vertex, goes to the final W vertex, and then back to R1.
So far, so good. But then they lose me when they say “We can embed the hitting set problem –among others– in safety by forcing (by the use of sets of singletons) each such choice to correspond to a hitting set”. I think what they’re saying is that they will create a set of variables corresponding to sets in C such that an unsafe cycle exists if and only if S has a hitting set. But I’m not sure how they get there- especially in polynomial time. I’m sure there’s a way, but it reads like “we set it such that it all works”, which isn’t convincing to me.
Difficulty: 9, because I don’t see how they do that last step. I’m sure a good explanation exists that would make this less difficult. I’ll also say that the reduction also says the transaction system is unsafe “if and only if there exists an unsafe path–and therefore a hitting set”. Which sounds like a Co-NP proof to me. I’m probably missing something.