This problem is from the same reference as last time’s but is a bit harder.

**The problem: **Second Order Instantiation. This is problem LO19 from the appendix.

**The description: **Given two second order logic expressions E_{1} and E_{2}, where E_{1} may contain variables and E_{2} may not. Can we make a substitution for the variables in E_{1} to get an expression identical to E_{2}?

**Example: **In second order predicate logic, functions can be variables. So we could have E_{1} be P(X) ∧ P(Y), where P is a function variable, and X and Y are “regular” boolean variables.

If E_{2} was f(a) ∧ f(b) (where all of f, a, and b are not variables), then the substitution {P/f, X/a, Y/b} would work. But if E_{2} was f(a) ∧ g(b), then no substitution would work.
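To make the matching concrete, here is a minimal sketch of applying such a substitution. The nested-tuple representation of expressions is my own assumption for illustration, not notation from the problem statement:

```python
# Expressions are nested tuples: (symbol, arg, arg, ...). A substitution maps
# variable names to constant symbols, covering both function variables (P)
# and "regular" variables (X, Y).

def apply_subst(expr, subst):
    """Replace every variable occurring in expr according to subst."""
    head, *args = expr
    new_head = subst.get(head, head)  # rename the symbol if it's a substituted variable
    return (new_head, *(apply_subst(a, subst) for a in args))

# E1 = P(X) ∧ P(Y), where P, X, and Y are all variables
E1 = ('and', ('P', ('X',)), ('P', ('Y',)))
# E2 = f(a) ∧ f(b), all constants
E2 = ('and', ('f', ('a',)), ('f', ('b',)))

subst = {'P': 'f', 'X': 'a', 'Y': 'b'}
print(apply_subst(E1, subst) == E2)  # True

# With E2 = f(a) ∧ g(b), the same substitution (and every other one) fails,
# because both conjuncts of E1 share the single function variable P.
E2_bad = ('and', ('f', ('a',)), ('g', ('b',)))
print(apply_subst(E1, subst) == E2_bad)  # False
```

The shared P in both conjuncts is what dooms the second case: no single substitution can send P to both f and g.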

**Reduction: **We again use the Baxter thesis that we used in the last problem to help us here. We start with a SAT formula S (though really, there’s no reason why we couldn’t just use 3SAT) with n clauses C_{1} through C_{n} and m variables P_{1} through P_{m}. E_{1} will contain one variable for each variable P_{i} in S, and one variable for each literal x_{i,j} (clause i, position j in the clause).

We also create second-order variables g_{i}, one for each clause in S. Each function g_{i} has degree equal to the number of literals in clause i.

E_{2} will be built out of the constant symbols T and F, and constant functions G_{i}, each G_{i} has degree equal to the number of literals in clause i.

We will also have an additional constant function H of degree 2, and another, K, of degree 2n.

From each clause i, we build an expression e_{i}‘ = H(g_{i}(T,…,T), g_{i}(F,…,F))

(Each g_{i} has as many parameters as there are literals in clause i that are all either T or F)

We also build an expression E_{i}‘ = H(T,F)

For each clause i, literal j, define the expression e_{i,j}”’ = g_{i}(x_{i,j}, …, p_{i,j}, …, x_{i,j})

(All of the parameters are x_{i,j} except for the jth one, which is the variable for p_{i,j})

We build the expression e_{i}” = G_{i}(e_{i,j}”’, …)

(Each G_{i} has as many parameters as there are literals in clause i)

We also build an expression E_{i}” = G_{i}(…)

Each parameter j is T if literal j of clause i is a positive occurrence of a variable, or F if literal j of clause i is a negated occurrence of a variable.

We build E_{1} = K(e_{1}‘, …, e_{n}‘, e_{1}”, …, e_{n}”)

E_{2} = K(E_{1}‘, …, E_{n}‘, E_{1}”, …, E_{n}”)

Baxter was again good enough to give us an example.

Suppose S was: P ∧ (~P∨Q) ∧ ~R ∧ (~Q∨R∨~P). We have 3 variables and 4 clauses. So

E_{1} = K(H(g_{1}(T), g_{1}(F)),

H(g_{2}(T,T), g_{2}(F,F)),

H(g_{3}(T), g_{3}(F)),

H(g_{4}(T,T,T), g_{4}(F,F,F)),

G_{1}(g_{1}(p)),

G_{2}(g_{2}(p, x_{2,1}), g_{2}(x_{2,2}, q)),

G_{3}(g_{3}(r)),

G_{4}(g_{4}(q, x_{4,1}, x_{4,1}), g_{4}(x_{4,2}, r, x_{4,2}), g_{4}(x_{4,3}, x_{4,3}, p)))

E_{2} = K(H(T,F), H(T,F), H(T,F), H(T,F),

G_{1}(T), G_{2}(F,T), G_{3}(F), G_{4}(F,T,F))
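The construction is mechanical enough to sketch in code. The following Python rebuilds this example’s E_{1} and E_{2} as strings; the symbol names (H, K, G_{i}, g_{i}, x_{i,j}) follow the writeup, but the string encoding and the clause representation are my own assumptions, not Baxter’s:

```python
def build_instance(clauses):
    """clauses: list of clauses; each clause is a list of (variable, is_positive)."""
    e_prime, E_prime, e_dprime, E_dprime = [], [], [], []
    for i, clause in enumerate(clauses, start=1):
        k = len(clause)
        # e_i' = H(g_i(T,...,T), g_i(F,...,F));  E_i' = H(T,F)
        e_prime.append(f"H(g{i}({','.join(['T'] * k)}),g{i}({','.join(['F'] * k)}))")
        E_prime.append("H(T,F)")
        # e_i'' = G_i(e_i,1''', ..., e_i,k'''), where e_i,j''' has the variable
        # of literal j in position j and x_{i,j} in every other position
        parts = []
        for j, (var, _) in enumerate(clause, start=1):
            args = [f"x{i}_{j}"] * k
            args[j - 1] = var.lower()
            parts.append(f"g{i}({','.join(args)})")
        e_dprime.append(f"G{i}({','.join(parts)})")
        # E_i'' = G_i(...), with T for positive literals and F for negated ones
        E_dprime.append(f"G{i}({','.join('T' if pos else 'F' for _, pos in clause)})")
    E1 = f"K({','.join(e_prime + e_dprime)})"
    E2 = f"K({','.join(E_prime + E_dprime)})"
    return E1, E2

# S = P ∧ (~P∨Q) ∧ ~R ∧ (~Q∨R∨~P)
S = [[('P', True)],
     [('P', False), ('Q', True)],
     [('R', False)],
     [('Q', False), ('R', True), ('P', False)]]
E1, E2 = build_instance(S)
print(E1)
print(E2)
```

Running it reproduces the expressions above, with E_{2} coming out as K(H(T,F),H(T,F),H(T,F),H(T,F),G1(T),G2(F,T),G3(F),G4(F,T,F)).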

Let’s look at G_{1} to see how the substitution works. We need G_{1}(g_{1}(p)) in E_{1} to map to G_{1}(T) in E_{2}. So we need g_{1}(p) to map to T. From the first H function in E_{1}, we also need g_{1}(T) to map to T and g_{1}(F) to map to F. The only way to make all of these substitutions true is if the variable p is mapped (or, in our SAT instance, assigned) to T.

Now let’s look at G_{4}. We need the following three mappings to exist:

- g_{4}(q, x_{4,1}, x_{4,1}) -> F
- g_{4}(x_{4,2}, r, x_{4,2}) -> T
- g_{4}(x_{4,3}, x_{4,3}, p) -> F

We already know from the H function that g_{4}(T,T,T) maps to T and g_{4}(F,F,F) maps to F, and from our G_{1} discussion that p has to map to T. For g_{4}(x_{4,3}, x_{4,3}, p) to map to F, we’ll need x_{4,3} to map to F (and then we have g_{4}(F,F,T) map to F). Notice that because we move the “variable” parameter around g_{4}, none of the other g_{4} expressions will be able to generate (F,F,T) as parameters, since they all have 2 copies of the same x variable in different places.
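That forced-mapping argument can be checked mechanically: collect the required input-to-output pairs for g_{4} and see which value of x_{4,3} leaves g_{4} a well-defined function. This is a toy check of this one example, not the general proof:

```python
def consistent(constraints):
    """constraints: list of (arg_tuple, output) pairs for one function.
    Consistent iff no argument tuple is required to produce two outputs."""
    seen = {}
    for args, out in constraints:
        if seen.setdefault(args, out) != out:
            return False
    return True

p = 'T'  # forced by the G_1 discussion above
for x43 in ('T', 'F'):
    constraints = [
        (('T', 'T', 'T'), 'T'),   # from H(g4(T,T,T), g4(F,F,F)) -> H(T,F)
        (('F', 'F', 'F'), 'F'),
        ((x43, x43, p), 'F'),     # e_{4,3}''' must map to F
    ]
    print(x43, consistent(constraints))  # only x43 = 'F' works
```

Choosing x_{4,3} = T would force g_{4}(T,T,T) to be both T and F, which is exactly the contradiction the prose describes.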

The actual proof of this reduction goes on for a couple of pages of the Baxter thesis, but hopefully this explains the basic idea.

**Difficulty: **8. This is harder to follow than last week’s, but the general idea of “binding variables and literals to functions that test for satisfiability” is a pattern we’ve seen several times.