Monthly Archives: November 2017

Linear Bounded Automaton Acceptance

The next problem in the appendix is a doozy, and it needs this reduction, which is in Karp’s paper.

The problem: Linear Bounded Automaton Acceptance.  This is problem AL3 in the appendix.

The description: Given a Linear Bounded Automaton L, and a string x, does L accept x?

Example: A well-known context-sensitive language is L = {a^n b^n c^n}.  If we had a definition for an LBA that accepts L, and a string such as aabbccc, the LBA should say “no” on this instance.  But if our string was aabbcc, the LBA would say “yes”.
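The membership test itself is easy to state: check the shape, then check the counts. Here’s a minimal Python sketch (not a literal LBA, of course, but both checks can be done in place with a constant number of sweeps over the input, which is the intuition for why a linear bounded automaton can decide this language):

```python
import re

def accepts(s):
    """Decide membership in {a^n b^n c^n} (n >= 1)."""
    # Shape check: the string must look like a...ab...bc...c.
    if not re.fullmatch(r"a+b+c+", s):
        return False
    # Count check: equal numbers of each letter.
    return s.count("a") == s.count("b") == s.count("c")

print(accepts("aabbcc"))   # True
print(accepts("aabbccc"))  # False
```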

Reduction: G&J say this is a “Generic reduction”, and I can see why.  Let me try to add some explanation and put it into something more like what we’re used to:

We start with a known NP-Complete (not NP-Hard) problem.  Let’s say 3SAT. 3SAT is in NP, which means we have a non-deterministic Turing Machine that takes an instance x and solves 3SAT in polynomial time, let’s say p(|x|).   Let’s assume that the alphabet of this Turing Machine is {0,1}.

Given this instance x of 3SAT, we build the following context-sensitive language (and it’s implied that we can go from there to an LBA in polynomial time): {#^(p(log |x|)) x #^(p(log |x|))} over the alphabet {0,1,#}.  So we have a number of # symbols before and after x equal to the polynomial applied to the log of the input size.

To be honest, I’m not sure why you need the log here.  I think you can get away with just having p(|x|) symbols on each side, and the total length of the LBA Acceptance instance is still polynomial in |x|.  The idea is that since we know the TM completes in time p(|x|), it can only ever move its head p(|x|) tape cells to the left or right before it runs out of time.  So we can use that as the “bound” on the LBA.
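Building the padded instance is trivial once we have the bound.  Here’s what it looks like in Python, using the simpler p(|x|) padding discussed above and a made-up polynomial for p (the real p comes from the running time of the non-deterministic machine for 3SAT):

```python
def lba_acceptance_instance(x, p):
    """Pad x with p(|x|) '#' symbols on each side; the padding marks
    how far the simulated machine's head is ever allowed to travel."""
    pad = "#" * p(len(x))
    return pad + x + pad

p = lambda n: n ** 2  # hypothetical time bound, for illustration only
print(lba_acceptance_instance("0110", p))  # 16 '#'s, then 0110, then 16 more
```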

So, if our LBA uses the exact same states and transitions as the non-deterministic Turing Machine that solves 3SAT, the LBA accepts x exactly when x is satisfiable.

The reason this is a “generic” reduction is that nothing we did had anything to do with 3SAT specifically.  We could do this process for any problem in NP.  It’s more useful if we start with an NP-Complete problem, but we could do this for things in P as well since they are also in NP.

Difficulty: 5, mainly because this is a weird way of doing things, and Linear Bounded Automata are things that often get short shrift in Theory of Computation classes.

Consistency of Database Frequency Tables

The end of the Storage and Retrieval section is in sight!

The problem: Consistency of Database Frequency Tables.  This is problem SR35 in the appendix.

The description: I find G&J’s definition confusing, so this definition borrows a lot from the paper by Reiss that has the reduction. 

We have a set of “characteristics” (attributes) A.  Each attribute a in A has a domain D_a.  A database is a set of “objects” (tuples) O_1..O_n, where each object defines a value for each attribute.  (The result is a two-dimensional table where rows are objects and columns are attributes.)  We can define a frequency table for each pair of attributes a and b in A.  The table has |D_a| rows and |D_b| columns, and each entry (x,y) is “supposed” to represent the number of tuples in O that have x for their a attribute and y for their b attribute.

What the problem is asking is: Given a set of frequency tables and a database V, can we find a way to map the attributes in A to the tables such that the tables actually represent the frequencies in the database?

Example:
Since you need a frequency table for each pair of attributes, here is an example with 3 attributes, each taking 2 possible values.  Attribute a’s domain is {0,1}, b’s is {a,b}, and c’s is {>, <}.  Our set of objects is:

  • (0,a,>)
  • (0,b,>)
  • (1,b,>)
  • (0,b,<)

If we are handed a set of 3 frequency tables:

C1 vs C2:

1 0
2 1

C1 vs C3:

0 1
1 2

C2 vs C3:

1 2
0 1

These frequency tables are accurate if C1 is attribute a, C2 is attribute b, and C3 is attribute c.
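One subtlety: the tables above only match if we also get to pick the ordering of each domain’s values (here, a’s values in the order (1,0), b’s in the order (b,a), and c’s in the order (<,>)).  Checking a candidate mapping is just counting pairs; a minimal Python sketch:

```python
# The four objects from the example, as (a, b, c) triples.
objects = [("0", "a", ">"), ("0", "b", ">"), ("1", "b", ">"), ("0", "b", "<")]

def freq_table(objs, i, j, row_vals, col_vals):
    """Entry (r, c) counts the objects whose i-th attribute equals r
    and whose j-th attribute equals c."""
    return [[sum(1 for o in objs if o[i] == r and o[j] == c)
             for c in col_vals]
            for r in row_vals]

# C1 vs C2, reading a's values as (1, 0) and b's as (b, a):
print(freq_table(objects, 0, 1, ["1", "0"], ["b", "a"]))  # [[1, 0], [2, 1]]
# C2 vs C3, reading c's values as (<, >):
print(freq_table(objects, 1, 2, ["b", "a"], ["<", ">"]))  # [[1, 2], [0, 1]]
```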

The reduction: From 3SAT.  (Actually, this might be from regular CNF-SAT, but there is no reason not to restrict the input to 3 literals per clause.)  We take a SAT instance with p clauses and q variables and build a database with 2 objects per variable (one for the positive setting, one for the negative) and p*q extra objects.  We have one attribute per clause (“CL_i”), one attribute per variable (“VR_i”), and “VALUE” (which holds the truth value of the object in the final setting).  The frequency tables are set up to ensure that each variable gets exactly one value setting (true or false) and that each clause is made true.  The way to make the database consistent with the tables is to find a way to map the variables in the database to the tables that makes the formula satisfiable.

Difficulty: 5, because the reduction isn’t that hard to follow, or come up with, once you get how the frequency tables work.  But unlike most 5’s, I don’t think I’d assign it as a homework problem, because of how much extra work it would take to explain the frequency tables in the first place.

Safety of Database Transaction Systems

This problem is related to last week’s but turns out to be much harder.

The problem: Safety of Database Transaction Systems.  This is problem SR34 in the appendix.

The description: Given a set of variables V and transactions T as defined in the Serializability of Database Histories problem, is every history H for T equivalent to some serial history?

Example: This is an extension of last week’s problem.  So last week’s example, which produced a history that is not serializable, means that set of transactions is not safe.

The easiest way to produce transactions that are safe is to make them access different variables: Suppose I have 2 transactions (R1, W1) and (R2, W2).  If the first transaction reads and writes a variable x and the second transaction reads and writes a variable y, then any ordering of those transactions will be serializable.
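This intuition can be checked mechanically with the textbook precedence-graph test: add an edge from T1 to T2 whenever an operation of T1 conflicts with a later operation of T2 (same variable, at least one write), and accept if the graph is acyclic.  (This is the standard sufficient test for conflict serializability, not necessarily the exact criterion used in the paper, which tolerates cycles made only of W-W edges.)  A minimal Python sketch:

```python
def conflict_serializable(history):
    """history: ordered list of (txn, action, var), action in {'R', 'W'}.
    Build the precedence graph and report whether it is acyclic."""
    edges = set()
    for i, (t1, a1, v1) in enumerate(history):
        for t2, a2, v2 in history[i + 1:]:
            # Conflict: different transactions, same variable, >= 1 write.
            if t1 != t2 and v1 == v2 and "W" in (a1, a2):
                edges.add((t1, t2))

    # Acyclic iff we can repeatedly remove nodes with no incoming edges.
    nodes = {t for t, _, _ in history}
    while nodes:
        removable = {t for t in nodes
                     if not any(u in nodes and v == t for u, v in edges)}
        if not removable:
            return False  # every remaining node sits on a cycle
        nodes -= removable
    return True

# Two transactions touching disjoint variables: any interleaving is fine.
h = [("T1", "R", "x"), ("T2", "R", "y"), ("T1", "W", "x"), ("T2", "W", "y")]
print(conflict_serializable(h))  # True
```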

Reduction: It’s in the same paper by Papadimitriou,  Bernstein, and Rothnie as the Serializability of Database History problem was.  It’s interesting that they couldn’t show that the problem was in NP.

They reduce from Hitting Set.  They show in the paper how to take a transaction system and build a graph where there is one vertex for each read and write operation in each transaction, and edges between the two operations in a transaction.  There are also edges between operations R_i and W_j, or W_i and W_j, if those operations share a variable.  These edges show places where changing the order changes the meaning of a transaction history.  They show that a transaction system is safe if and only if the graph has no cycles containing an (R_j, W_j) edge.  (Note that this means cycles can exist as long as they contain only W-W edges.)

So, given an instance of Hitting Set (a set S and a collection C of subsets of S), they build a transaction graph: one read vertex for each set in C, plus one write vertex at the end.  Between read vertices R_i and R_{i+1} we add |C_i| edges (or, so that we still have a simple graph, |C_i| paths containing vertices that don’t appear anyplace else).  At the end of this chain of paths is a single W vertex, with an edge back to R_1.  The only unsafe cycle now starts at R_1, goes through one of the paths connecting each R vertex, goes to the final W vertex, and then back to R_1.

So far, so good.  But then they lose me when they say “We can embed the hitting set problem –among others– in safety by forcing (by the use of sets of singletons) each such choice to correspond to a hitting set”.  I think what they’re saying is that they will create a set of variables corresponding to the sets in C such that an unsafe cycle exists if and only if S has a hitting set.  But I’m not sure how they get there, especially in polynomial time.  I’m sure there’s a way, but it reads like “we set it up so that it all works”, which isn’t convincing to me.

Difficulty: 9, because I don’t see how they do that last step.  I’m sure a good explanation exists that would make this less difficult.  I’ll also note that the reduction says the transaction system is unsafe “if and only if there exists an unsafe path–and therefore a hitting set”.  Which sounds like a co-NP proof to me.  I’m probably missing something.