Now we enter the “Database Problems” subsection of the chapter. I’ll admit that I’ve had trouble with a lot of these, partially because databases aren’t an area I know a ton about, but also because the definitions of the problems often obscure what’s really happening.

**The problem: **Minimum Cardinality Key. This is problem SR26 in the appendix.

**The description: **Given a set A of attributes, a collection F of ordered pairs of subsets of A (“functional dependencies”), and a positive integer M, can we find a set K of M or fewer attributes that forms a key for the relational system &lt;A,F&gt;?

G&J defines “form a key” as: the ordered pair (K,A) belongs to the closure F* of F, where F* is the smallest set with the following properties:

- F ⊆ F*
- B⊆C⊆A implies (C,B) ∈ F* (the “projective” rule)
- (B,C) ∈ F* and (C,D) ∈ F* imply (B,D) ∈ F* (the “transitive” rule)
- (B,C) ∈ F* and (B,D) ∈ F* imply (B,C∪D) ∈ F* (the “additive” rule)

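To make the rules concrete, here is a toy Python sketch (my own illustration, not from G&J; the function names and the set encoding are mine): starting from F = {(B,C), (C,D)}, one transitive step puts (B,D) into the closure, and an additive step then adds (B,C∪D).

```python
# Pairs of frozensets stand in for the ordered pairs (left, right) of
# attribute sets. Each function applies one round of one closure rule.

def transitive_step(pairs):
    """One round of the transitive rule: (X,Y) and (Y,Z) give (X,Z)."""
    new = set(pairs)
    for (x, y) in pairs:
        for (y2, z) in pairs:
            if y == y2:
                new.add((x, z))
    return new

def additive_step(pairs):
    """One round of the additive rule: (X,Y) and (X,Z) give (X, Y∪Z)."""
    new = set(pairs)
    for (x, y) in pairs:
        for (x2, z) in pairs:
            if x == x2:
                new.add((x, y | z))
    return new

B, C, D = frozenset({"b"}), frozenset({"c"}), frozenset({"d"})
F = {(B, C), (C, D)}
F1 = transitive_step(F)   # now also contains (B, D)
F2 = additive_step(F1)    # now also contains (B, C ∪ D)
print((B, D) in F1, (B, C | D) in F2)  # True True
```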
**Example: **The paper by Lucchesi and Osborn that has the reduction has a good example:

Suppose we have a database table for student records. The attribute set A could be {Name, ID, Course, Professor, Time}. Some functional dependencies, written as ordered pairs, could be:

- ({Name},{ID}) (If you know a student’s name, you can figure out their ID)
- ({ID},{Name}) (If you know a student’s ID, you can figure out their name)
- ({Name, Course}, {Professor, Time}) (If you know a student’s name and a course they are in, you can figure out the professor teaching it and what time the course is)
- ({Professor, Time}, {Course}) (If you know the professor of the course and the time it is being taught, you can figure out the course being taught. Note that you can’t figure out the student because more than one student will be in the course)

If these dependencies are F, then we can derive all of the attributes just from knowing the name of the student and the course they are taking. In other words, F* contains the pair ({Name, Course}, {Name, ID, Course, Professor, Time}).
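A quick way to check that kind of membership in F* is the standard attribute-closure algorithm: repeatedly fire any dependency whose left side is already derivable. Here is a hedged Python sketch (the encoding and the function name are mine), run on the student example above:

```python
def attribute_closure(attrs, fds):
    """Return every attribute derivable from `attrs` under the FDs.

    `fds` is a list of (left, right) pairs of frozensets.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            # Fire a dependency when its left side is derivable
            # but its right side adds something new.
            if left <= closure and not right <= closure:
                closure |= right
                changed = True
    return closure

# The student-records example.
FDS = [
    (frozenset({"Name"}), frozenset({"ID"})),
    (frozenset({"ID"}), frozenset({"Name"})),
    (frozenset({"Name", "Course"}), frozenset({"Professor", "Time"})),
    (frozenset({"Professor", "Time"}), frozenset({"Course"})),
]
A = {"Name", "ID", "Course", "Professor", "Time"}

# {Name, Course} derives everything, so it's a key.
print(attribute_closure({"Name", "Course"}, FDS) == A)  # True
```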

**Reduction: **The Lucchesi and Osborn paper uses different terminology than G&J. They define a relation D in a series of inductive steps, where D[0] is F, and each D[i] is created from D[i-1] by doing one step of one of the closure rules on the elements of D[i-1]. The final closure (what G&J call F*) they call D.

They also define an expansion of a relation D: Given a subset B of our attribute set A, if a pair (L,R) is in D, where L is a subset of B but R isn’t, the set B∪R is a D-expansion of B.

They first show that if B is a subset of A, and B’ is a D[0]-expansion of B, then (B,B’) and (B’,B) are in D. They then show that a subset B of A is D[i]-expansible (and thus D-expansible) if and only if it is D[0]-expansible.
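Here is a small Python sketch of the expansion definition (the names are mine, not the paper’s), again using the student example: the only D[0]-expansions of {Name, Course} come from firing the first and third dependencies.

```python
def d0_expansions(B, fds):
    """Every D[0]-expansion of B: B ∪ R for each (L, R) with L ⊆ B, R ⊄ B."""
    return [B | R for (L, R) in fds if L <= B and not R <= B]

# The student-records dependencies again.
FDS = [
    (frozenset({"Name"}), frozenset({"ID"})),
    (frozenset({"ID"}), frozenset({"Name"})),
    (frozenset({"Name", "Course"}), frozenset({"Professor", "Time"})),
    (frozenset({"Professor", "Time"}), frozenset({"Course"})),
]

for exp in d0_expansions(frozenset({"Name", "Course"}), FDS):
    print(sorted(exp))
```

Running this prints the two expanded sets, one adding ID and one adding Professor and Time.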

Then, they do the reduction from VC. A will have one attribute for each vertex in G. D[0] (G&J’s F) will, for each vertex v, relate the set of v’s neighbors to v. They show that a subset K of A is a key if and only if it’s a vertex cover of G, by induction on the number of attributes outside of K. If K=A, it’s trivially both a key and a cover. If K is smaller than A, then K is a key if and only if some D[0]-expansion K∪{v} of K is a key, and such an expansion exists exactly when some vertex v not in K has all of its neighbors in K. If K is a cover, every vertex outside of K has that property, so K can expand step by step all the way to A, making K a key. Conversely, if K is a key, the expansion that adds v tells us that every edge touching v is already covered by K, and by induction on the expanded set we get that K is a cover.
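Under this reading of the construction, the whole reduction fits in a few lines of Python. This is my hedged sketch, not the paper’s code: build one dependency (N(v), {v}) per vertex, then compare “is a key” against “is a vertex cover” on a small path graph.

```python
def fds_from_graph(edges, vertices):
    """The reduction: one dependency (neighbors of v, {v}) per vertex v."""
    nbrs = {v: set() for v in vertices}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return [(frozenset(nbrs[v]), frozenset({v})) for v in vertices]

def is_key(K, fds, A):
    """K is a key iff its attribute closure under the FDs is all of A."""
    closure = set(K)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if left <= closure and not right <= closure:
                closure |= right
                changed = True
    return closure == set(A)

def is_vertex_cover(K, edges):
    """K covers the graph iff every edge has an endpoint in K."""
    return all(u in K or v in K for u, v in edges)

# A path a-b-c: {b} is a vertex cover (and a key); {a} is neither.
V = {"a", "b", "c"}
E = [("a", "b"), ("b", "c")]
fds = fds_from_graph(E, V)
print(is_key({"b"}, fds, V), is_vertex_cover({"b"}, E))  # True True
print(is_key({"a"}, fds, V), is_vertex_cover({"a"}, E))  # False False
```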

**Difficulty: **8. There must be an easier way to explain this problem.