Tag Archives: Difficulty 8

String-To-String Correction

This one reminds me of an old puzzle problem.

The problem: String-To-String Correction.  This is problem SR20 in the appendix.

The description: Given two strings x, and y over a finite alphabet ∑, and an integer K.  Can we start from x and derive y in K or fewer steps, where each step is either deleting a symbol or interchanging a symbol.

Example: The puzzles I’m thinking of are the ones that say “Can you go from HELLO to OLE in 6 steps?”

  • HELLO
  • ELLO
  • ELO
  • EOL
  • OEL
  • OLE

The reason this is a hard problem is when you have repeated symbols, and need to “choose” which ones to delete.  A slightly harder example is to go from ABABABABBB to BAAB.  Now which A’s and B’s you choose to delete have an effect on the total number of moves.

The paper by Wagner that has the reduction makes the point that if you replace “Delete” with “Insert”, you get basically the same problem but go from y to x instead of from x to y.  The paper gives other variants that have polynomial solutions (allowing insertions and deletions and changes of a character, whether or not you allow interchange).

Reduction: Wagner uses Set Covering.  So we’re given a collection of sets c1..cn all subsets of some set W, and an integer K.

Define t = |W| and r = t2.  Pick 3 symbols Q,R, and S that are not in W.  The string x  will be:

QrRciQrSr+1

..concatenated together each time for each ci.  (So the copy that has c1 is first, then another copy with c2, and so on.

The string y will be n copies of the substring RQr, followed by w, followed by n copies of the substring Sr+1

The K’ for this problem is (K+1)*r-1 + 2t(r-1) + n(n-1)(r+1)2/2 + d.

d is r*n+|c1…c2| – |W|.

From here, Wagner explains that the way the correction problem works is by deciding which copies of the R, Qr and Sr+1  sets in x match to the corresponding ones in y.  By counting all of the operations that “have” to be done (by deleting or moving characters), he shows that the a correction problem that has weight of K’ has to cross over certain ci sets to make a covering, and that there have to be K or less of those sets.

Difficulty: 8.  The construction is pretty crazy.  You can eventually see where the components come from, but I can’t imagine how he came up with these things (especially the x and y strings) in the first place.

Protected: Shortest Common Supersequence

This content is password protected. To view it please enter your password below:

Protected: Pruned Trie Space Minimization

This content is password protected. To view it please enter your password below:

Protected: Numerical 3-Dimensional Matching

This content is password protected. To view it please enter your password below:

Protected: 3-Matroid Intersection

This content is password protected. To view it please enter your password below:

Protected: Comparative Divisibility

This content is password protected. To view it please enter your password below:

Protected: Intersection Pattern

This content is password protected. To view it please enter your password below:

Protected: Set Basis

This content is password protected. To view it please enter your password below:

Protected: Planar 3-Satisfiability

This content is password protected. To view it please enter your password below:

Protected: Maximum Fixed-Length Disjoint Paths

This content is password protected. To view it please enter your password below: