
Recent Posts
 Single Execution Time Scheduling With Variable Number of Processors February 16, 2018
 Sequencing to Minimize Maximum Cumulative Cost February 2, 2018
 Register Sufficiency January 24, 2018
 Sequencing With Deadlines and Setup Times January 17, 2018
 Knapsack January 10, 2018
Recent Comments
Archives
 February 2018
 January 2018
 December 2017
 November 2017
 October 2017
 September 2017
 August 2017
 July 2017
 June 2017
 May 2017
 April 2017
 March 2017
 February 2017
 January 2017
 December 2016
 November 2016
 October 2016
 September 2016
 August 2016
 July 2016
 June 2016
 May 2016
 April 2016
 March 2016
 February 2016
 January 2016
 December 2015
 November 2015
 October 2015
 September 2015
 August 2015
 July 2015
 June 2015
 May 2015
 April 2015
 March 2015
 February 2015
 January 2015
 December 2014
 November 2014
 October 2014
 September 2014
 August 2014
 July 2014
 June 2014
Categories
 Algebra and Number Theory
 Appendix Automata and Language Theory
 Appendix Mathematical Programming
 Appendix Network Design
 Appendix Program Optimization
 Appendix Sets and Partitions
 AppendixGraph Theory
 AppendixLogic
 Appendix: Sequencing and Scheduling
 Appendix: Storage and Retrieval
 Chapter 3 Exercises
 Core Problems
 Overview
 Problems not in appendix
 Uncategorized
Meta
Tag Archives: Serializability of Database Hstories
Protected: Safety of Database Transaction Systems
Enter your password to view comments.
Posted in Appendix: Storage and Retrieval
Tagged Hitting Set, Safety of Database Transaction Systems, Serializability of Database Hstories
Serializability of Database Histories
This is another problem with a cool elegant reduction once you get past the baggage you need to know to understand the problem. This database section seems to be full of problems like these.
The problem: Serializability of Database Histories. This is problem SR33 in the appendix.
The description: We have a set V of variables in our database, and a set T of transactions, where each transaction i has a read operation (R_{i}) that reads some subset of V, and a write operation W_{i} that writes some (possibly different) subset of V. We’re also given a “history” H of T, which permutes the order of all reads and writes maintaining the property that for all i, R_{i} comes before W_{i} in the history. Think of this as a set of parallel transactions that reach a central database. H is the order the database processes these operations.
Can we find a serial history H’ of T, with the following properties:
 Each R_{i} occurs immediately before its corresponding W_{i}.
 A “live transaction” is a transaction (R_{i}, W_{i}) where either W_{i }is the last time a variable is written before the R_{j} of some other live transaction or the last time the variable is written at all. The set of live transactions in H and H’ needs to be the same.
 For each pair of live transactions (R_{i}, W_{i}) and (R_{j}, W_{j}), for any variable v in W_{i}∩R_{j}, W_{i} is the last write set to contain v before R_{j} in H if and only if W_{i} is the last write set to contain v before R_{j} in H’. The paper says that this means transaction j “reads from” (or “reads v from”) transaction i.
Example: The paper by Papadimitriou, Bernstein, and Rothnie that has the reduction has a good simple example of a nonserializable history:
H= <R_{1}, R_{2}, W_{2}, W_{1}>, where R_{1} and W_{2} access a variable x, and R_{2} and W_{1} access a variable y. Both transactions are live since they both write their variables for the last time. Notice that neither transaction reads any variable. But the two possible candidates for H’ are: <R_{1}, W_{1}, R_{2}, W_{2}> (where R_{2} reads the y written by W_{1}) and <R_{2}, W_{2}, R_{1}, W_{1}> (where R_{1} reads the x written by W_{2}), so neither H’ candidate has the same set of transactions reading variables from each other.
Reduction: Is from NonCircular Satisfiability. Given a formula, they generate a “polygraph” of a database history. A polygraph (N,A,B) is a directed graph (N,A) along with a set B of “bipaths” (paths that are 2 edges long). If a bipath{(v,u), (u,w)} is in B, then the edge (w,v) is in A. So, if a bipath exists in B from v to w, then an edge in A exists from w back to v. This means that we can view a polygraph (N,A,B) as a family of directed graphs. Each directed graph in the family has the same vertices and an edge set A’ that is a superset of A and contains at least one edge in each bipath in B. They define an acyclic polygraph as a polygraph (represented as a family of directed graphs) where at least one directed graph in the family is acyclic.
In the paper, they relate database histories to polygraphs by letting the vertex set N bet a set of live transactions. We build edges in A (u,v) from transactions that write a variable (vertex u) to transactions that read the same variable (vertex v). If some other vertex w also has that variable in their read set then the bipath {(v,w), (w,u)} exists in B. So edges (u,v) in A mean that u “happens before” v since u writes a variable that v reads. A bipath {(v,w), (w,u)} means that w also reads the same variable, so must happen before u or after v. They show that a history is serializable if and only if the polygraph for the history is acyclic.
So, given a formula, they build a polygraph that is acyclic if and only if the formula is satisfiable. The polygraph will have 3 vertices (a_{j}, b_{j}, and c_{j}) for each variable x_{j} in the formula. Each a vertex connects by an edge in A to its corresponding B vertex. We also have a bipath in B from the b vertex through the corresponding c vertex back to the a vertex.
Each literal C_{ik} (literal #k of clause i) generates two vertices y_{ik} and z_{ik}. We add edges in A from each y_{ik} to z_{i(k+1)mod 3} (in other words, the y vertex of each clause connects to the “next” z vertex, wrapping around if necessary). If literal C_{ik} is a positive occurrence of variable X_{j}, we add edges (c_{j}, y_{ik}) and (b_{j}, z_{ik}) to A, and the bipath {(z_{ik}, y_{ik}), (y_{ik}, b_{j})} to B. If the literal is negative, we instead add (z_{ik}, c_{j}) to A and {(a_{j}, z_{ik}), (z_{ik}, y_{ik})} to B.
If the polygraph is acyclic (and thus the history it represents is serializable), then there is some acyclic digraph in the family of directed graphs related to the polygraph. So the bipath {(b_{j}, c_{j}), (c_{j}, a_{j})} will have either the first edge from bc (which we will represent as “false”) or will have the second edge from ca (which we will represent as “true”). (The directed graph can’t have both because its edges are a superset of A, which means it has the edge (a_{j}, b_{j}) and taking both halves of he bipath will cause a cycle).
If our acyclic directed graph has the “false” (bc) version of the edge for a literal, then it also has to have the zy edge of the bipath associated with the literal (otherwise there is a cycle). If all of the literals in a clause were set to false, this would cause a cycle between these bipath edges and the yz edges we added in A for each clause. So at least one literal per clause must be true, which gives us a way to satisfy the formula.
If the formula is satisfiable, then build the acyclic digraph that starts with all of A, and takes the bipath edges corresponding to the truth value of each variable, as defined above. This implies ways you need to take the edges from the bipaths for the literals, to avoid cycles. The only way now for the graph to be acyclic is for there to be a cycle of x’s and y’s in the edges and bipath edges. But that would imply that we’ve set all of the literals in a clause to false. Since we know that the clause can be made true (since the original formula is satisfiable), we know that a way exists to make the directed graph acyclic.
Difficulty: 7. It takes a lot of baggage to get to the actual reduction here, but once you do, I think it’s pretty easy and cool to see how the cycles arise from the definitions of the graphs and from the formula.