Category Archives: Core Problems
Hamiltonian Path
Before I start doing my own problems where G&J don’t give proofs, here’s a really simple extension of Hamiltonian Circuit that will be useful.
The problem: Hamiltonian Path
The description: Given a graph G=(V,E), where |V| = N. Can I create an ordering <v1, v2…vn>of the vertices where an edge exists between each (vi, vi+1) in the sequence?
(This is the Hamiltonian Circuit problem without the restriction that I return to the start)
EDIT 4/24/15. My original reduction doesn’t work. I’ve left it below. Thanks to Fernando Berzal for pointing it out to me. I’ve added his answer, which looks correct.
In all honesty, the best part about being wrong is that it means people actually read these things!
The new Reduction: From HC. Given a graph G =(V,E), an instance of HC, pick any vertex v.
We’re going to build a graph G’ =(V’, E’)
V’ = All of the vertices in V, plus a new vertex v’ (which is a “copy” of v)
E’ = All of the edges in E, plus an edge (v’, w) if there was an edge (v,w) in E.
So, G’ is G, with one vertex (v) “duplicated” as v’. v’ has edges to all of the places v has.
We want to see if there’s an HP from v to v’.
If G’ has an HP from v to v’, then the last edge in the path goes from some vertex u to v’. If we replace that edge with the edge (u,v) we get a HC in G.
If G has an HC, then we can follow that cycle starting from v. Instead of the last edge coming back to v, follow it to v’. This will give us a HP in G’
My reduction that doesn’t work: From HC. Given a graph GHC, an instance of HC, we build GHP to be the exact same graph with 2 vertices added- vstart and vend. We add two edges, one from vstart to some vertex in GHC, and one from vend to the same vertex in GHC. The idea is that a Hamiltonian path in this graph would start at vstart, travel through the HC in the given graph (if it exists), and end at vend.
This doesn’t work because if the two new vertices both connect to the same vertex then we hit it twice in our HP, which breaks the rules.
Difficulty: 2. This is a good “simple reduction” for students to work on. I don’t use it because it’s just a variant on the same problem, and I like to drive home how reductions tie very different-looking problems together. But I do like how it’s simple, but does make you make some changes to the original problem.
(I think that I’m going to upgrade this difficulty to 4 now, since I can see a lot of students making the same mistake I did)
Partition
First, an administrative note. I wanted to call this site “Annotated NP-Complete Problems”, because the idea is that I’m going through the Garey&Johnson book and adding notes to each problem talking about how to do the reduction and how applicable it is for student assignments. But that name is sort of taken already, and I don’t want to step on any toes or cause any confusion. So I figured that I’d change the title now, before anyone finds out about the site.
And as I’ve been writing, these notes feel more like “discussions” than “short annotations” anyway, so I think this is a better title.
This is the last of the “core six” problems in the G&J book, as defined in Chapter 3. There are several other problems presented in that chapter, with proofs, but since the point of this exercise is to get to the problems without proofs, I’m going to skip over them, and come back to them only if I need to (because they’re the basis for a future reduction, for example).
The problem: Partition (PART)
The definition: Given a set of integers A, can I fid a subset A’⊆A such that the sum of all of the elements in A’ is exactly half the sum of all of the elements in A?
(Alternately, given a set of integers A, can I split all of the elements of A into two subsets B and C, such that the sum of all of the elements in B is equal to the sum of all of the elements in C?)
(Alternately (this is the G&J definition), given any set A, where each element a∈A has a size s(z) that is a positive integer, can we find a subset A’ of A where the sum of the sizes of everything in A’ is exactly equal to the sum of the sizes of everything in A-A’?)
Example: Suppose A was the set {1,2,3,4,6}. Then A’ could be {1,3,4}, forming a partition (A’ and A-A’ both sum to 8). If instead, A was the set {1,2,3,4,5}, then no possible partition exists. This may seem like a silly case (any set A where the sum of the elements is odd has no partition), but even if the sum of the elements of A is even, it’s possible that no partition exists- for example if A is {2,4,100}.
Note: Partition is one of my favorite NP-Complete problems, because the description is so easy, and it seems so simple. It’s probably my go-to example if I want to explain this stuff to non-technical people in under a minute- pretend that the elements of A are weights, and the value of each element is the weight in ounces. The partition problem asks “given this group of weights, and a “scales of justice” style scale, can you make the scale balance using all of the given weights?”. It’s pretty surprising to most people that the only known way to answer that question basically boils down to “try all possible combinations of arranging things on the scale”
The reduction: From 3DM. G&J provide the reduction on pages 61-62. The basic idea is to have one element in A for each element in M, and to represent the elements in A as binary numbers that have 1’s in positions corresponding to which element from W, X, and Y we get the triple from. The number is set up so that we have a maximum possible value of the sum of all of these elements in A. They then add two “extra” elements to A so that each side of the partition (elements of M that make a matching, and everything else) will add up to the same value
Difficulty: 7. The idea of using binary numbers to keep track of “positions” in a list is tricky, but comes up in lots of places. If students have seen that idea before, this becomes a hard but doable problem. If students haven’t seen that idea before, I’d make this a 9.
Hamiltonian Circuit
This is one of my favorite problems, but one of the worst reductions to understand. I usually skip over it when I teach it, showing them the picture of the widget and saying “See? It’s Nuts!” and moving on
The problem: Hamiltonian Circuit (HC)
The definition: Given a graph G=(V,E), where |V| = N. Can I create an ordering <v1, v2…vn>of the vertices where an edge exists between each (vi, vi+1) in the sequence, and the edge (vn, v1) also exists?
(In other words, does a sequence of edges exist that lets me start at some vertex, visit each other vertex exactly once, and return to the start?)
Example:
The sequence <a,d,h,k,g,c> forms a Hamiltonian Circuit
The reduction: Is from VC, and it’s nuts. G&J do it on pages 56-60. Starting with an instance of VC (A graph G, and an integer K), build a new graph G’ with:
- K “selector” vertices (a1-ak)
- 12 vertices and 14 edges (see page 57) for each edge in G (a “cover-testing” component
- Edges to connect the selector vertices to the start and end of the cover-testing component.
The Cormen algorithms book has an example of this, and it takes a graph with 4 vertices and 4 edges and builds a HC instance with 50 vertices and 72 edges.
Difficulty: 9, only because I’m sure there exist reductions that have like 25 page proofs, and I want to save my 10 for that. But really, I think just making the students trace this reduction on an example is a really hard problem.
Independent Set
The last of the three similar graph problems, it’s also probably the easiest reduction. When I teach NP-Completeness, I’ve had a lot of success starting with this reduction (instead of building a stable of NP-Complete problems from zero like the textbooks do). You kind of have to say “Trust me, Vertex Cover is NP-Complete, I’ll show you why later”, but starting with a simple reduction means that they can follow the idea and the process right from the beginning without having to be bogged down in ugly conversions involving SAT.
The problem: Independent Set (IS)
The definition Given a graph G=(V,E) and an integer J. Does G have a collection of at least J vertices where no edges connect any two vertices in J?
Example: Using the same graph:
{a,h,k} is an independent set of size 3.
The reduction: From VC. Use the exact same graph, and set J=|V|-K. (G has an independent set of J vertices exactly when it has a vertex cover of K vertices. Vertices not in the cover can’t have edges between them).
Difficulty: 1. There’s a reason why I do this first (though sometimes I go from Clique instead). In general, though, I try to avoid making them do problems that are too easy for homeworks and tests. If the reduction is trivial, it makes showing that the conversion solves the original problem difficult. You get a lot of proofs that basically say “C’mon! They’re the same thing!”. Which is not what I’m looking for 🙂
Clique
G&J don’t even bother to prove Clique is NP-Complete, just stating that it (and Independent Set) is a “different version” of Vertex Cover. But the problem comes up often enough that it’s worth seeing in its own right.
The problem: Clique (I’ve also seen “Max Clique” or “Clique Decision Problem” (CDP))
The definition: Given a graph G=(V,E), and a positive integer J (G&J say J needs to be ≤ |V|, but I think you can allow large values and make your algorithm just say “no”). Does J contain a subset V’ of V, or size at least J, where every two vertices in V’ are joined by an edge in E?
Example: Using the graph we had for VC:
{a,c,d} form a clique of size 3. There are no cliques of size 4, but if we add the edge (d,g) and the edge (c,h), then the vertices {c,d,g,h} would be one.
The reduction: From Vertex Cover. Given a graph G and an integer K that is an instance of Vertex Cover, take the complement of G (the complement of a graph has the same vertices, but the set of edges is “inverted”: (u,v) is an edge in the complement if and only if it was not an edge in G), and use it as your instance of CDP. Set J= |V|-K
Difficulty: 3. It’s not a 2 because it takes a little thought to see why the complement works.
Note: As stated perviously, I teach this course using the Cormen book, so in class I show Clique is NP-Hard by reducing from 3SAT. It’s a good example of a hard reduction, probably about as hard as the VC reduction G&J do.
Vertex Cover
Back to the “core 6” with a classic problem from graph theory. There are lots of similar graph problems, so it’s important to keep them all straight.
The Problem: Vertex Cover (VC)
The Definition: Given a graph G=(V,E) and a positive integer K (G&J say that K ≤ |V|, but really, if it’s bigger than that, you can have the algorithm just return “yes”). Can I find a subset V’ of V, |V’| ≤ K, where every edge in E has at least one endpoint in V’?
Example: Hmm, I wonder if I can insert an image..
Ok, looks like that worked. In the graph above, {c,d,g} is a vertex cover, because all edges have at least one endpoint in the set {c,d,g}. As we’ll see later, a graph containing a clique (complete subgraph) of N nodes will need at have at least N-1 of those vertices in the cover.
The reduction: From 3SAT. G&J on pages 54-56 show the construction. We build a graph with vertices for all variables and their negation, with an edge between them. We have three variables for each clause, connected in a triangle. (The three variables correspond to “positions” in the clause). Then there is an edge from each vertex corresponding to a literal (or its negation) to its corresponding “clause vertex”- the vertex that holds the position of that variable in the original formula. K= # of variables + 2* number of clauses.
Difficulty: 7 , assuming it’s the first time a student has seen this sort of reduction. If they have seen something similar (for example, a reduction from 3SAT to Clique), then maybe a 5. The ideas are pretty similar.
Note: I teach my algorithms class using the Cormen book. In it, they reduce 3SAT to Clique, proving Clique is NP-Complete, and then reduce Clique to VC. This works in the exact same way as the reduction from VC to Clique that I’ll be doing here next.
Exact Cover by 3-Sets
This is not one of the “core six”, but it is used a lot in reductions, so it’s worth including since it builds right off of 3DM
Also, I think I will include examples for lots of these problems. Lots of times I have trouble parsing the problem description, so creating a concrete example is helpful.
The problem: Exact Cover by 3-Sets (X3C)
The definition: Given a set X, with |X| = 3q (so, the size of X is a multiple of 3), and a collection C of 3-element subsets of X. Can we find a subset C’ of C where every element of X occurs in exactly one member of C’? (So, C’ is an “exact cover” of X).
Example: Suppose X was {1,2,3,4,5,6}
If C was {{1,2,3},{2,3,4},{1,2,5},{2,5,6}, {1,5,6}} then we could choose C’ to be {{2,3,4},{1,5,6}} as an exact cover because each element in X appears exactly once.
If instead, C was {{1,2,3},{2,4,5},{2,5,6}}, then any C’ we choose will not be an exact cover (we need all 3 subsets to cover all elements in X at least once, but then the element 2 appears three times).
Note that if we do have an exact cover, C’ will contain exactly q elements.
The proof:G&J prove this “by restriction” which basically means that they show how X3C is a more general version of 3DM . If you view an instance of 3DM as a special case of X3C by letting XX3C = W∪X3DM∪Y and C = all q3 triples taking one element from W, one element from X3DM, and one element from Y), then the C’ you get as a solution to X3C is also a matching for 3DM.
(Note that lots of these reductions will be between problems that use the same symbols in both problems. I’ll do my best to disambiguate by using subscripts to mark where the common letter comes from. So, in this case X3DM is the set X that we get from the 3DM problem (one of the 3 input sets), and XX3C is the set we build for the X3C problem (the set we need to cover). Hopefully that doesn’t make things more confusing)
(Also note that like many (most?) Computer Science people, I’m a big fan of nested parentheses. I’m sure all of you can follow along with that. (Right?))
Personally, I don’t like proofs by restriction as a way to teach this stuff to students. It’s very easy to mess up the “this is a special case” argument- you get incorrect arguments like “this is just a special case of SAT where we return true wherever there’s a cover and false when we don’t!”. Also it feels like you’re going backwards from a real reduction, and since getting the direction wrong is probably the #1 most common issue students have in doing reductions, anything that makes their job harder isn’t a great idea.
If I taught this in a class, I’d make them do a proper reduction out of it- start with an instance of 3DM (W, X3DM, and Y), and build an instance of X3C (creating XX3C and C), and going from there.
Difficulty: 3. It’s only not a 2 because I, at least, have trouble understanding what the X3C and 3DM problems are asking. It’s not as straightforward to explain as many other problems
3-Dimensional Matching
Last one for today. I’m not sure how exactly I’ll space these out. We’ll see.
The Problem: 3-Dimensional Matching (3DM)
The Definition: Given disjoint sets W,X,Y all with the same number (q) of elements, and a set M of triples from WxXxY. Is there a subset M’ of M of size q where no two elements of M’ agree in any coordinate?
I always have trouble picturing what this means, so here’s an example:
Let W = {1,2,3,4}, X = {10,20,30,40}, Y = {100,200,300,400}. q is 4 in this case.
M is a set of triples like (1,30,400), (2,20,100), and so on, with one element from each of W,X, and Y.
We’re asking for q elements of M that have no duplicates in any coordinate. In other words, each of {1,2,3,4} appears in exactly one element of M’, each of {10,20,30,40} appears in exactly one element of M’, and so on. So, a legal M’ could be:
{(1,20,300), (2,30,400), (3,40,100), (4,10,200)}
..assuming all 4 of those triples happened to be in the M we were given.
The Reduction: From 3SAT. Pages 50-52 explain how we take the clauses and build W, X, Y, and M from them. It’s pretty crazy-M contains 3 kinds of triples for 3 different kinds of jobs, build according to 3 different sets of rules.
Difficulty: 8. I couldn’t imagine a student coming up with this without a lot of hand-holding.
3-Satisfiability
The first of the “core six” problems, and the first NP-Complete problem that’s actually a reduction from a known NP-Complete problem.
The Problem: 3-Satisfiability (3SAT)
The Definition: (p.46) Given a set of variables U, and a set C of clauses, where each clause has exactly 3 elements, can we assign true/false values to the variables in U that satisfies all of the clauses in C?
(Note that just as the Satisfiability problem was really “CNF-SAT”, most people would call this problem “3-CNF-SAT”)
The Reduction: From SAT. Pages 48-49 give a simple algorithm to convert arbitrarily-sized clauses into clauses of size 3, possibly by adding some extra variables.
Difficulty: 6. It’s pretty fiddly to come up with the conversions in all cases. But this really depends on the students’ familiarity with Boolean logic.
As an aside, I find myself compelled to note that while 3-CNF-SAT is NP-Complete, the similar problem of 2-CNF-SAT (where all clauses are of size 2) can be solved in polynomial time. To me, one of the coolest things in Computer Science is how such small changes to a problem (like 3-CNF to 2-CNF, or Fractional Knapsack to 0/1 Knapsack-where the solution space shrinks!) can drastically affect the complexity of the problem.