Equational Theories

9 Metatheorems from Invariants

For the purposes of this chapter, a theorem is a (true) statement about particular equations, for example ‘(387 implies 43)’ is a theorem. A metatheorem is a general statement about implications; one can usually get many theorems from a single metatheorem. This chapter is all about generating many interesting metatheorems using a meta-metatheorem, called the fundamental property of invariants. If all this is making your head spin, don’t worry. Look at the sections below for examples of metatheorems you can probably agree are both concrete and interesting. Once you have done that, come back here and we will show you how to prove these and other metatheorems using invariants.

9.1 Invariants

Let E,E1, and E2 be equations. If EE1 and E1E2, then EE2. Very trivial. Rephrasing this, we see that if EE1 and EE2, then E1E2.

Extending this idea, suppose we compute the set of all equations which are implied by E; we will call this set Y(E) (we use Y because this is an example of a Yoneda embedding). Then Y(E) is upwards closed, or closed under forward implication: no equation in Y(E) can imply an equation not in Y(E). If we know Y(E) well, this already settles a potentially large number of implications in the negative.

While computing Y(E) for an arbitrary equation E may seem daunting, for some nice equations we can find invariants, which makes the task manageable. An invariant for E is some sort of data associated with expressions w so that

Y(E)={w=wInvariant(w)=Invariant(w)}

If we can find an invariant which is computable for each term w, then we can easily describe Y(E). The fact that Y(E) is upwards closed is rephrased as follows; this is called the fundamental property of invariants. Remember that an invariant is a function taking expressions and outputting some data.

Meta-metatheorem 9.1 Fundamental property of invariants

Let I be an invariant of E. If w=w implies w=w and I(w)=I(w) (that is, E implies w=w), then I(w)=I(w).

More succinctly, for an invariant I of E we must have

(w=ww=w)(I(w)=I(w)I(w)=I(w)).

When using this result, we commonly take the contrapositive: if I(w)=I(w) and I(w)I(w), then w=w cannot imply w=w. Note that the conclusion is independent of the equation E; all we need to know is that I is an invariant.

Note for category theorists

The fundamental property of invariants is not a theorem, nor a metatheorem: it is a meta-metatheorem, in the sense that it will allow us to get a metatheorem for every invariant we find.

Example: absorption law

Let E be the equation xy=x. Intuitively, we must have

Y(E)={w=wthe leftmost variable is the same for w and w}.

We will talk about proving statements like this one (say in Lean) later on; take it as given for now. The invariant is clear: we define I(w) to be the leftmost variable of w. Instantiating this invariant in the fundamental property of invariants, we get the following metatheorem.

Metatheorem 9.2
#

Let w=w be an equation such that the leftmost variable of w is the same as the leftmost variable of w. Then w=w cannot imply an equation that does not have the property from the last sentence.

Example: associativity

For a more complicated example, let E be the associativity equation x(yz)=(xy)z. Intuitively, we must have

Y(E)={equations that, when we remove all parentheses, are of the form w=w}.

There is an invariant lurking behind: it is the (ordered) list of variables appearing in an expression, counting repetitions. More formally, we define I(w) to be the tuple of variables appearing in w, listed from left to right, say. Again, from the fundamental property of invariants we get the following.

Metatheorem 9.3
#

Let w=w be an equation such that the variables appearing in w, taking into account order and repetitions, are the same ones that appear in w. Then w=w cannot imply an equation that does not have the property from the last sentence.

If we were coding a computer program that computes I(w) given w, one could take the string of symbols that is w, ignore all parentheses, replace all symbols by commas, and surround with an appropriate delimiter. (I imagine one could easily do this using regular expressions.)

We can compute other examples, but the invariant can get complicated even for simple equations. Exercise: what is the invariant for commutativity? Answer: To compute I(w) from w replace all parentheses with curly braces and all symbols with commas, and interpret the result as nested sets.

9.2 Expanding the language

The method of invariants really shines when we expand our formal language. Right now our language consists of variables, parentheses, the equal sign, and (there is also an implicit use of but let’s ignore that for now). Let Π denote the preorder of equations (built from the language described) ordered by implication.

We will add the symbol (‘and’) to our language. Then we consider a bigger preorder ΠΠ which includes equations and also conjunctions of equations. Even if we only care about Π it will be apparent that studying invariants in Π gives us useful metatheorems about Π. Equations and conjunctions of equations are examples of formulas (or formulae, according to taste).

If φ is a formula, we can define Y(φ) to be the set of all formulae implied by φ; this agrees with our previous definition. Now define an invariant of φ to be a function I on terms such that

Y(φ)Π={w=wI(w)=I(w)}.

Again, this clearly agrees with our previous definition. Although Y(φ)Π might not be upwards closed in Π, it is upwards closed in Π, which is enough to get the fundamental property of invariants verbatim. This leads to more metatheorems we didn’t have access to before.

Example: associativity and idempotency

Let φ be the conjunction of the associative law and the idempotency law (xx=x). Again, we will rely on our intuition, which says that an invariant I defined by taking I(w) to be the set of all variables appearing in w, works. The corresponding metatheorem is the following

Metatheorem 9.4
#

Let w=w be an equation such that the set of variables appearing in w is equal to the set of variables appearing on w. Then w=w cannot imply an equation that does not have the property from the last sentence.

Example: associativity and commutativity

For a similar example, we can let φ be the conjunction of the associative and the commutative laws. Here we can define I(w) to be the multiset of variables appearing in w. We obtain the following metatheorem.

Metatheorem 9.5
#

Let w=w be an equation such that the variables appearing in w, taking into account multiplicity, are the same ones that appear in w. Then w=w cannot imply an equation that does not have the property from the last sentence.

Trivia: this was the first example of a metatheorem obtained by use of an invariant.

Example: associativity and commutativity with a twist

We can keep expanding our language if it helps us express more intricate invariants. For instance, we can add the symbol ‘1’ to our language. Let φ be the conjunction of associativity, commutativity, the equations 1x=x, and

xxxm times=1,

for some fixed positive integer m. Pause to guess the invariant before we move on.

The invariant I(w) is the multiset of variables appearing in w but multiplicities are computed modulo m. Thus we have the pretty metatheorem:

Metatheorem 9.6
#

Fix some positive integer m. Let w=w be an equation such that every variable appearing in w appears the same number of times in w modulo m. Then w=w cannot imply an equation that does not have the property from the last sentence.

9.3 Proving metatheorems from invariants in Lean

For the rest of this chapter we readopt the convention of calling ‘theorem’ an important result, not necessarily pertaining to specific equations.

An invariant is generally a syntactic property of an expression. However, invariants can also be described and calculated semantically through the notion of a lifting magma family, described below. The general idea is that the value of an invariant for an expression can be computed by substituting specific values for the variables in the expression and evaluating the result in a certain magma in the lifting magma family; additional requirements ensure that the fundamental property of invariants is satisfied.

Definition 9.7 Lifting Magma Family
#

A lifting magma family is a family of magmas {Gα}, one for each type α, satisfying the following properties:

  • For each type α, there is a function ια:αGα.

  • Given a function f:αGα, there is a magma homomorphism liftf:GαGα such that liftf(ια(x))=f(x) for all x in α.

Example 3
#

The free Abelian groups form a lifting magma family. When the underlying set is finite, the groups are isomorphic to Zn.

Example 4
#

Lists form a lifting magma family.

The key consequence of the Definition 9.7 is that it is significantly easier to check whether an equation is satisfied in a lifting magma family.

Theorem 9.8 Evaluation theorem for lifting magma families
#

Suppose E is an equation involving a set of variables X, and let G be a lifting magma family.

Determining whether E is satisfied by GX is equivalent to checking that E is true with the specific substitution ιX.

Proof
Theorem 9.9 The fundamental property of invariants

Let E and E be equations involving a set of variables X, and let G be a lifting magma family.

If E is true with the substitution ιX, and E implies E, then so is E.

Proof
Remark 1
#

The result of evaluating an expression along the function ιX:XGX is the invariant.

In the case of Abelian groups, the result of evaluation is the variables in the expression with multiplicity. In the case of lists, the result of evaluation is the variables in the expression in the order they appear.

When the lifting magma family has good computational properties, calculating the invariant becomes easy.

Remark 2
#

Given an equation ϕ in the language of magmas (possibly involving logical operations other than equality and universal quantification), the initial (i.e., most general) magmas satisfying ϕ (provided they exist) form a lifting magma family.

However, for the purpose of generating invariants, we are interested in lifting magma families with convenient descriptions that are computationally tractable.

Remark 3
#

Suppose S is a finite set of equations in the language of magmas that is a confluent term rewriting system under a certain ordering of the terms (in the sense of the Knuth-Bendix algorithm). Then the initial magmas satisfying S form a lifting magma family where equality of elements in the magma is decidable.

This offers a way of generating examples of lifting magma families with good computational properties for computing invariants of expressions.

9.4 Generating laws from equations

The invariants defined in this chapter are properties of the syntax of the equations being considered. In other words, they are properties of the laws associated with the equations, rather than of the equations themselves. Proving non-implications using invariants requires a way to operate on the syntax of the equations and then translate the reasoning back to results about the original equations.

A magma law can be generated from an equation by accessing the syntax used in its definition and converting it to a declaration representing a magma law through metaprogramming. There is a choice in the variable set of the magma law – one one hand, it can be a finite set whose size matches the number of variables, and on the other hand, it can be the set of natural numbers. The advantage of the former is that one can generate proofs that the satisfiability of the magma law is equivalent to the satisfiability of the original equation (this only needs to be done for variable sets of size up to six, since that is the maximum size currently being considered in the project; it’s convenient to prove individual lemmata for each variable set size establishing this equivalence). The advantage of the latter is that it bypasses the need to cast between various finite sets while constructing a model as a counter-example.

One approach is to generate both forms of the law, using the first to establish the equisatisfiability of the law and the equation and then transporting this result to the second form of the law. The conversion from the first form to the second is summarised in the lemma below.

Lemma 9.10
#

[Compatibility between magma laws over finite sets and the natural numbers] Let E be a magma law defined over n variables and let E~ be the same equation with variables ranging over the natural numbers (formally, E~ is the image of E under the canonical map from the finite set with n elements to the natural numbers). Then any magma M satisfies E if and only if it satisfies E~.

Proof

9.5 Conclusion: Beyond Invariants

We are still lacking:

  • A large collection of invariants.

  • An estimate for how many implications the resulting metatheorems will settle.

  • Algorithms (in Lean, Python, or otherwise) to compute known invariants.

  • General results about lifting magmas.

  • Formalization of the method of invariants and resulting metatheorems.

  • Knowledge about the category-theoretic interpretation of invariants (see the questions in the note for category theorists).

Related to the last bullet point, we note the following. If all that matters about invariants is the fundamental property, we can apply the old French trick of turning a (meta-meta)theorem into a definition.

Q: If we were to define invariants as any functions satisfying the fundamental property, would anything change? (For those who read the note for category theorists: an equivalent redefinition is to consider invariants as functors ΠBool).