February | 2024 | Discussions of NP-Complete Problems

This is a cool, fun reduction. I also teach And/Or graphs in my Artificial Intelligence class, so it was fun to see this as a problem here.

The problem: Minimum Weight And/Or Graph Solution. This is problem MS16 in the appendix.

Description: An And/Or graph is a directed, acyclic graph, where each vertex is marked as either an “and” node, or an “or” node. We’re given a start vertex, s. Then we “solve” the subtree at s by:

Solving all children of s, if s is an “and” node,
Or, solving one child of s, if s is an “or” node.

The solution proceeds recursively, until we get to a leaf, which can be solved directly. Here is an example from the paper:

So, to solve P1, we could solve either P2, P3, or P7. To solve P2, we’d need to solve both P4 and P5.

In this problem, the edges are weighted, and we’re asked: Given an integer K, is there a solution that traverses edges costing K or less?

Example: In the graph above, the cheapest solution costs 3: Solve P1 by solving P3, solve P3 by solving P6. If P3 was an “and” node, instead of an “or” node, then the best solution would cost 4: Solve P1 by solving P2, solve P2 by solving both P4 and P5.

Reduction: Sahni’s paper has a bunch of reductions we’ve done. This time, he uses 3SAT. So, from a given 3SAT instance, we build a graph:

S is the root node, and is an “and” node
S has one child for each variable (x₁..x_n). There is also one extra child, P.
Each x_i is an “or” node with 2 children, one for “true”, one for “false”. (So 2n children total)
Each of these 2n nodes have 1 edge coming out of it to one of the 2n leaves of the tree.
P is an “and” node, and has one child for each clause (C_1..C_k).
Each clause node is an “or” node, and has 3 edges corresponding to its literals. So if variable x_i appears in clause C_j positively, there is an edge to the “true” child of x_i. If the negation ~x_i appears, there is an edge to the “false” child of x_i.
The weights of each edge are 0, except for the 1-degree edges going to the leaves, which cost 1.

To solve this graph, we need to solve S, which means we need to solve each variable (by “choosing” a truth setting for it), and we also need to solve P. Solving P means we need to solve each clause. Each clause is solved by choosing a truth setting for one of its literals that makes the clause true. Since we are already choosing a truth setting for each variable, we are paying N in cost. If we can choose setting of the variables that also solve each clause, we pay no more than N in cost, and also satisfy each clause. If the formula is unsatisfiable, then we need to solve one variable “twice” by setting it to true and false, making us pay more than N overall.

So the formula is satisfiable if and only if the tree cost is exactly N.

Difficulty: 6. I don’t know that I’d expect a student to come up with this themselves. It is similar to a lot of the 3SAT reductions we’ve done, though (one “thing” per clause, one “thing” for each variable setting, connections between the clause and the settings of its literals), so if a few of these were gone over with students, maybe this could be presented along with them.

We teach decision trees in our AI and Data Analytics classes, so this is a nice problem that relates to that.

The problem: Decision Tree. This is problem MS15 in the appendix.

The description: We’re given a set X of n objects, and a set T of tests. Each test can be applied to each object, and returned true or false. Can we arrange tests so that after at most K tests, we uniquely identify all objects in X?

Example: A decision tree is a way to take an unknown object and classify it into one of the elements of X. So suppose X= {1,2,3,4} We’ll have 3 tests:

Test 1: Is it even?

Test 2: Is the value of the number < 3?

Test 3: Does the spelling of the word have < 4 letters?

Then we could write this decision tree of height 2:

We could also use the same test in multiple nodes. The decision trees I come across also allow for the same object from X to appear in many different leaves (because X is usually a boolean set like {Yes, No}), but the definitions here don’t seem to allow that.

Reduction: The paper from Hyafil and Rivest uses X3C. So we’re given a set of elements X, and a collection C of 3-element subsets of X. We’ll build a set X’ (the objects of the decision tree problem) by taking X and adding 3 new elements: {a,b,c}. We will have 1 test in T for each triple in C (“Is our object one of these 3 elements?”). We’ll also have 1 test for each element x in X’ (“Is our object x?”).

The paper defines a formula that we are trying to minimize for the height of the tree:

$f(n) = \min_{1 \leq i \leq 3} [f(n-i) + f(i) + n]$ if $n \geq 4$

$f(1) =2, f(2) = 2, f(3) = 5, f(4) = 8$

So our K is f(|X’|).

We’ll want the optimal tree to look like this (picture from the paper):

The nodes in the tree above are 3-element sets from C that form the cover. So the first node is asking “Is X in the first set of the cover?”. If the answer is yes, we move to the right, and use the 1-element tests to figure out which of the 3 leaf nodes correspond to the exact element. If the answer is no, we move to the left, and ask the same question of the second set in the cover. The way to minimize the tree’s height is to have the “what 3-element set” questions be first. We also minimize the height by making sure that each element of X’ is covered by a node in the tree- in other words, the $T_{i_k}$ sets have to form a cover. This means we get the minimum height (of K) from the tree if and only if we have a cover from C.

Difficulty: 6. The reduction itself isn’t hard, but the formula is hard to see, and the paper kind of handwaves the “of course you need to be an exact cover” part of the proof. I think if students had to do it, they’d run into some sticky issues.

Discussions of NP-Complete Problems

A Resource for Computer Science Educators

Monthly Archives: February 2024

Minimum Weight And/Or Graph Solution

Decision Tree