Tuesday, July 5, 2011

On Function Purity and Simple Framing

Here is the latest blog post by Bertrand Meyer.

http://bertrandmeyer.com/2011/07/04/if-im-not-pure-at-least-my-functions-are/

It talks about the problem of function purity and broaches the subject of framing. In short, the frame of the specification of a routine is the set of variables (in the broad meaning of the term) that the execution of the routine can modify. In object-oriented programming, it is especially tricky because of the sets of dynamically allocated objects whose state a given routine might need to alter. Indeed, such a set can be taken to be the set of objects reachable from a given root. In a limited solution, such as the one shown by Meyer in his latest post, you only name the variables and expressions that can change. In more elaborate solutions, such as the dynamic frames of Kassios (as cited by Meyer), it is possible to provide a set of objects together with the attributes that can be expected to change. This is especially handy for specifying dynamic data structures like linked lists.

Such a solution seems to present the disadvantage of disallowing the extension of frames through inheritance, a possibility which Meyer's approach preserves. I think a compromise can be reached between the two, but it is a bit too involved for a post intended only to share another blog post.

Tuesday, June 21, 2011

Why I Prefer Predicates to Sets

Various recent comments by others have made me realize that it is about time I write a comparison between sets and predicates.

One comment that springs to mind is a response I got when I said that I almost never use sets, that I prefer predicates. I was asked what the point of choosing was, since predicates can be represented as sets: a predicate can be seen as a boolean function, which in turn can be seen as a functional relation, which is basically a set of tuples. More directly, with every predicate we can associate a set which contains exactly the objects that satisfy the predicate. To which I have to add that, for each set, there exists a characteristic predicate which is satisfied exactly by the objects contained in the set.

What we just established is the semantic equivalence between the two. Such equivalence is less important than it might seem. From a logician's point of view, the equivalence tells us that for each theorem of predicate logic, there should be an equivalent theorem in set theory.

From an engineering and scientific point of view, that is from the point of view of someone who wants to use either of these tools, there is an important difference, one of methodological importance, which, understandably, is irrelevant for logicians.

The difference lies in the flexibility of the related notation. It is hence of syntactic nature. If we want to let the symbols do the work --and I do--, we have to make sure that we can find solutions in the simplest possible way.

One way of doing this is to avoid spurious distinctions, i.e. to preserve symmetry as much as we can. This is where predicate calculus [0] is superior to set theory. If we start by encoding set theory into predicate calculus, we will see, little by little, a fundamental difference emerge. In the following, capital letters will stand for sets and the corresponding predicates will be the same letters in lower case. As an example:

	(0)  x ∈ P  ≡  p.x
		for all x

with the dot standing for function application and ≡ for logical equivalence [1]. It relates a set to its characteristic predicate.

Note:
Please notice that the convention of upper and lower cases is nicely symmetric. We could have defined the characteristic predicate of a set as cp.S, or the set of a predicate as set.p, but we chose not to in order to preserve the symmetry between the two. Instead of saying that one is the characteristic predicate of the other, we say that they correspond to each other, and the correspondence does not have to be spelled out in every formula where they appear.
(end note)
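As an aside, the correspondence (0) is easy to make concrete in a programming language. Here is a small Haskell sketch --my own illustration, with names of my choosing, not part of any of the references-- in which a predicate is a boolean-valued function and, over a finite domain, determines a set and vice versa:

	import qualified Data.Set as Set

	-- A predicate over a is just a boolean-valued function.
	type Pred a = a -> Bool

	-- Over a finite domain, a predicate determines a set ...
	toSet :: Ord a => [a] -> Pred a -> Set.Set a
	toSet domain p = Set.fromList (filter p domain)

	-- ... and a set determines its characteristic predicate.
	toPred :: Ord a => Set.Set a -> Pred a
	toPred s = \x -> Set.member x s

	-- (0): x ∈ P  ≡  p.x, checked for every x in the domain.
	law0 :: Ord a => [a] -> Pred a -> Bool
	law0 domain p = and [ Set.member x (toSet domain p) == p x | x <- domain ]

For instance, law0 [0..10] even evaluates to True.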

Notation

Here is a little bit more of syntax before I start the comparison. I take it mostly from [DS90].

If we want to say that a predicate p holds exactly when q holds we can express it as follows:

		〈∀ x:: p.x ≡ q.x〉

It is a bit too verbose for something that is supposed to relate p and q very simply and, for that reason, we introduce a shorthand. Instead of quantifying over x and using x everywhere in the formula, we will have an operator that says: this predicate holds everywhere:

	 	[p ≡ q]

It is more concise and it is an improvement. It might seem like cheating, though, unless we know of the notion of pointwise extension of operators. For instance, ≡ applies to boolean operands but we allow ourselves to apply it to compare boolean-valued functions too. This is how it then behaves:

	(1)  (p ≡ q).x  ≡  p.x  ≡  q.x
		for all x

We implicitly postulate that all logical connectives apply pointwise to boolean-valued functions. If we add one more definition, the previous universal quantification can be calculated into a form that uses the everywhere operator.

	(2)  [p]  ≡ 〈∀ x:: p.x〉
		 〈∀ x:: p.x ≡ q.x〉
		=   { pointwise extension }
		 〈∀ x:: (p ≡ q).x〉 
		=   { (2) with p := (p ≡ q) }
		  [p ≡ q]
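Continuing the Haskell sketch from above (again my own rendering, not taken from [DS90]), pointwise extension and the everywhere operator --restricted to an explicit finite domain-- look like this:

	type Pred a = a -> Bool

	-- Pointwise extension of a boolean connective to predicates, as in (1):
	-- (p ≡ q).x  ≡  p.x ≡ q.x    (on Bool, ≡ is (==))
	lift2 :: (Bool -> Bool -> Bool) -> Pred a -> Pred a -> Pred a
	lift2 op p q = \x -> p x `op` q x

	-- The everywhere operator (2), with an explicit finite domain standing
	-- in for the implicit universal quantification.
	everywhere :: [a] -> Pred a -> Bool
	everywhere domain p = all p domain

	-- [p ≡ q] then reads:
	equivEverywhere :: [a] -> Pred a -> Pred a -> Bool
	equivEverywhere domain p q = everywhere domain (lift2 (==) p q)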

Translating Sets

We can start our comparison by translating sets into predicate formulations. We will concentrate on set equality, subset, intersection, union, subtraction and symmetric difference.

	(3)  P = Q  ≡  [p ≡ q]
	(4)  P ⊆ Q  ≡  [p ⇒ q]
	(5)  x ∈ P ∩ Q  ≡  (p ∧ q).x
	(6)  x ∈ P ∪ Q  ≡  (p ∨ q).x
	(7)  x ∈ P \ Q  ≡  (p ∧ ¬q).x
	(8)  x ∈ P △ Q  ≡  (p ≢ q).x

with △ being the symmetric difference --the set of elements present in exactly one of the two sets-- and ≢ being the discrepancy, the negation of ≡. We see that the first two operators are translated into a form with one logical connective and the everywhere operator. Also, no set operator is built around the equivalence or the implication alone. Therefore, there is no unique translation of expressions like the following:

	[p ≡ q ≡ r ≡ s]

To allow us to manipulate the above, we introduce a definition for set equality. It can be calculated from our previous postulates.

	(9)  P = Q  ≡ 〈∀ x:: x ∈ P  ≡  x ∈ Q〉
		  [p ≡ q ≡ r ≡ s]
		=   { double negation }
		  [¬¬(p ≡ q ≡ r ≡ s)]
		=   { ¬ over ≡ twice, see (10) below }
		  [¬(p ≡ q) ≡ ¬(r ≡ s)]
		=   { (11) }
		  [p ≢ q ≡ r ≢ s]
		=   { everywhere expansion (2) }
		 〈∀ x:: (p ≢ q ≡ r ≢ s).x〉
		=   { pointwise extension of ≡ }
		 〈∀ x:: (p ≢ q).x ≡ (r ≢ s).x〉
		=   { (8) twice }
		 〈∀ x:: x ∈ P △ Q ≡ x ∈ R △ S〉
		=   { (9) }
		  P △ Q  =  R △ S
	(10)  [¬(p ≡ q) ≡ ¬p ≡ q]
	(11)  [¬(p ≡ q) ≡ p ≢ q]

Of course, for someone used to it, this is a simple conversion and there would be no need to be this explicit but I include it nonetheless for the sake of those who are not used to it.

Our formula is therefore proven to be equivalent to the one below.

	P △ Q  =  R △ S

We could also introduce the negations elsewhere and get:

	P  =  Q △ R △ S

Since equivalence is associative --that is, ((p ≡ q) ≡ r) ≡ (p ≡ (q ≡ r))-- and symmetric --that is, (p ≡ q) ≡ (q ≡ p)--, there are a whole lot of other formulae from set theory that are equivalent to the predicate calculus formula we showed. All those results create many theorems which basically "mean" the same thing but that one may have to remember anyway. By comparison, our one predicate formula can be parsed in various ways, and we can use it to substitute p ≡ q for r ≡ s, or r for p ≡ q ≡ s, to name just two usages of a single formula.
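For readers who want a mechanical confirmation, here is a tiny Haskell check of my own --with exhaustion over the booleans standing in for the everywhere operator-- that the pointwise content of [p ≡ q ≡ r ≡ s] agrees with that of P △ Q = R △ S:

	-- On Bool, ≡ is (==) and the discrepancy ≢ is (/=).
	-- By (8), x ∈ P △ Q is (p ≢ q).x, so the right-hand side below is the
	-- pointwise content of P △ Q = R △ S.
	chainVsSymDiff :: Bool
	chainVsSymDiff =
	  and [ (p == (q == (r == s))) == ((p /= q) == (r /= s))
	      | p <- [False, True], q <- [False, True]
	      , r <- [False, True], s <- [False, True] ]
	-- evaluates to True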

One could suggest that the problem lies with the usual reluctance of logicians to talk about set complements, and that if we had an operator, say ⊙, to stand for the complement of △ --which can be seen as problematic because it creates a set with elements taken from neither of its operands--, we wouldn't need to introduce spurious negations. The above formula could then be translated as:

	P ⊙ Q  =  R ⊙ S 

It remains to choose which equivalence becomes an equality and which ones become the complement of a symmetric difference. I believe the biggest problem to be that set equality mangles two orthogonal notions together: boolean equality and (implicit) universal quantification. The same goes for implication.

In short, this is one example where set theory destroys a beautiful symmetry and introduces spurious distinctions that pollute one's reasoning rather than help it. Whenever I have problems to solve and calculations can help me, in the vast majority of cases, I use predicate calculus rather than set theory.

Simon Hudon
April 10th, 2011
Meilen

Notes

[0] I use the calculus, not the logic. The difference is that the logic deduces theorems from other theorems whereas the calculus simply helps us create relations between formulae and find new formulae from old ones whether they are theorems or not.

[1] Like many others, I have forsaken the double arrow for that purpose. The choice can be traced back to [DS90] and [AvG90].

References

[AvG90] Antonetta J. M. van Gasteren. 1990. On the Shape of
       Mathematical Arguments. Springer-Verlag
       New York, Inc., New York, NY, USA.

[DS90] Edsger W. Dijkstra and Carel S. Scholten. 1990. Predicate
       Calculus and Program Semantics. Springer-Verlag New
       York, Inc., New York, NY, USA.

Tuesday, June 14, 2011

"Lifting weights with your brain"

I am posting the following just for the sake of sharing a quote I liked in an essay I'm reading:

In workouts a football player may bench press 300 pounds, even though he may never have to exert anything like that much force in the course of a game. Likewise, if your professors try to make you learn stuff that's more advanced than you'll need in a job, it may not just be because they're academics, detached from the real world. They may be trying to make you lift weights with your brain.

By Paul Graham, Undergraduation. I don't suppose it will completely eradicate the comment "What good does it do me to learn [put a beautiful piece of theory, technology, etc here]? People don't use it." Another reply of course might be to point out that "maybe they should".

Simon Hudon
June 14th, 2011
Meilen

Monday, June 6, 2011

Decidability

The proof of the undecidability of the halting problem has always fascinated me. It is a formidable result to behold, at least the first time around. It relies on the construction of a program that cannot exist because of a contradiction drawn through self-reference. Here is an interesting rendering of the argument by Christopher Strachey: An Impossible Program.
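To make the structure of the argument concrete, here is a Haskell rendering of my own (the names are mine; `halts' is hypothetical since the whole point is that no such total function can exist, and I conflate a program with its value, which a real decider could not do):

	-- Hypothetical halting decider; it cannot be given a working body.
	halts :: () -> Bool
	halts _ = error "no total halting decider exists"

	-- Strachey-style troublemaker: if the decider says it halts, it loops;
	-- if the decider says it loops, it halts.  Either answer refutes the
	-- decider, so `halts' cannot exist.
	impossible :: ()
	impossible = if halts impossible then impossible else ()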

However, an interesting paper written by Eric Hehner (Problems with the Halting Problem) casts doubt on the correctness of the usual proof. He proposes that the contradiction in the proof is due to self-reference rather than to the structure of the halting problem. It is a good read and I strongly recommend it.

Simon Hudon
Meilen
June 6th 2011

Monday, May 9, 2011

A Lecture Supported by an Automated Theorem Prover

UPDATED: added note [1] on mutual implication

Last week, on Monday, I attended the guest lecture that Prof. Tobias Nipkow from the Technical University of Munich (TUM) gave at ETH. I had heard of him before because of his work on the Isabelle theorem prover. An interesting thing before I start: he refers to it as a proof assistant.

I didn't expect much of this talk because of my reluctance to rely on automatic provers to do my job and because I expected a very technical talk about proof strategies implemented, or to be implemented, in Isabelle. Since such issues need concern only people involved in the construction of theorem provers, and I am not, I didn't think it would appeal to me.

I was pleasantly surprised that he had chosen to talk about a course he had recently started to give on the topic of semantics. Said course was noteworthy because formal methods and an automated prover were used as teaching vehicles. I am pleased that he decided to use formal methods as a tool for understanding the topic and that he opted to spend more time on semantics, the subject matter, and less on logic and formal methods.

It is my considered opinion, however, that a prerequisite course on the design of formal proofs would be most useful, even necessary. I draw this conclusion because I believe that acquiring effective techniques for designing elegant proofs can help tremendously in making hard problems easier. But, since he didn't want to teach logic, he gave very little attention to the design of proofs and adopted "validity of the proofs" as the goal the students had to reach in their assignments. It was asked --by someone else before I could ask it-- what importance he gave to the style of the submitted proofs. He said none, because some (in his opinion) very good students had an awful style and he didn't want to penalize them for it. He considered that, having submitted a correct proof, the student had shown that he understood. I would say that, in doing so, he was doing his students a disservice. However, I can now better understand his position because the underlying assumption is very popular: since words like style and elegance have a very strong aesthetic connotation, it becomes automatically whimsical to judge them or to encourage students to improve upon them. After all, we're here to do science, not to make dresses!

*                         *
*

Even when it has been successfully argued just how useful mathematical elegance can be, people keep opposing its use as a yardstick on the grounds that it is too subjective and that enforcing one standard would be arbitrary.

It turns out that not only are elegance and style very useful, but they can also be analyzed and criticized very objectively. It was a preferred subject of Dijkstra's in the second part of his career and it appears almost everywhere in his writing --see EWD619, Essays on the nature and role of mathematical elegance, for such an exposition [0]--.

The usefulness of an elegant style comes from the fact that it yields simple and clear proofs requiring very little mental work from the reader (which is not something to be underestimated) but, more importantly, it allows the writer of a proof to be economical in the use of his reasoning abilities. This can make the difference between a problem which is impossible to solve and one which is easily solved. On the other hand, in a course such as the one Nipkow has designed, the most interesting things for students to learn are not the individual theorems that they proved but the techniques that they used to find a proof. If the techniques are not taught, it can be expected that the skills the students acquire will be of much less use when they are confronted with different problems.

What an elegant style boils down to is, to put it in Dijkstra's words, the avoidance of complexity generators and the separation of one's concerns. He also argued that concision is an effective yardstick to evaluate the application of those techniques. Indeed, concision is effectively achieved by both separating one's concerns and avoiding complexity generators. The best known complexity generators are case analyses, proofs of logical equivalence by mutual implication and inappropriate naming. The first two are legitimate proof techniques but their application doubles the burden of the proof. It doesn't mean that they should never be used but rather that they should be avoided unless the proofs of the required lemmata differ significantly. [1] I say significantly to stress that, in some cases, they can differ in small ways and, upon closer scrutiny, the differences can be abstracted from.

The issue of naming is also closely related to the separation of one's concerns but, being a tricky issue, I would rather point the reader to chapter 15, "On Naming", of van Gasteren's book "On the Shape of Mathematical Arguments", which covers the subject very nicely. Since Dijkstra was van Gasteren's supervisor, he wrote the chapter with her, which earned it an EWD number: 958 [0]. This allows me to skip directly to the matter of separation of concerns.

*                         *
*

While debating about concision and about Nipkow's lecture with my supervisor, something kept coming up in his arguments. I was favoring short formal proofs and he kept asking: "What if a student doesn't want to call 'auto' [the function of Isabelle which can take care of the details of a proof step] but wants to go into the details to understand them?" First of all, I have to point out that being the input for an automated tool doesn't relieve a proof from the obligation of being clear. [2] Unlike Nipkow, I hold that the proof must include intermediate formulae accompanied by a hint of how one proceeds to find them. This would correspond to the combination of what he calls a proof script --a series of hints without intermediate formulae-- and a structured proof --a series of intermediate formulae with little to no hints; this is what he prefers to use--. In that respect, 'auto' is no better than the hint "trivially, one sees ...". The choice of how early one uses 'auto' is basically a matter of decomposition. It is unrelated to the peculiarities of the prover and relies only on how clear (to a human reader) and concise the proof is without the details. What my supervisor and Nipkow call "using auto later in the proof" would be, in a language independent of the use of an automated prover, including more details. If the proof is clear and concise without those details, they don't belong in the body of the proof. It is that simple. One could, however, include them in the proof of a lemma invoked in one (or many) proof steps of the main proof.

By including a lemma, one doesn't destroy concision, because the proof of said lemma can be included beside that of the main theorem rather than intertwined with it. The difference is that, while one reads a proof, each step must clearly take the reader closer to the goal. Any digression should be postponed so as not to distract the attention from the goal. One good indication that a lemma is needed is when many steps of a proof are concerned with a different subject than the goal. For instance, when proving a theorem in parsing theory, if many successive steps are concerned with predicate calculus, the attention is taken away from parsing. Instead, making the whole predicate calculation in one step and labeling it "predicate calculus" is very judicious. Nothing prevents the proof that pertains to predicate calculus from being presented later on, especially if it is not an easy proof.

The important point here is that sticking to the subject at hand doesn't mean forgetting that there are other problems that need one's attention. It means dealing with one problem at a time, each time forgetting momentarily what the other problems are. This is exactly what modularity is about.

Furthermore, with an automatic prover, nothing prevents someone from using 'auto' to commit the sin of omission; details that would make a step clear are then missing. It is then a matter of style to judge how much should be added. This reinforces my point that good style should be taught, because clarity is the primary goal with proofs [3].

With respect to such a tool, I would welcome the sight of one where keywords like 'auto' are replaced by catchwords like 'predicate calculus' to hint at the existence of a simple proof --at most five steps-- in predicate calculus --in this case-- that supports the designated step. More often, we could use invocations like 'theorem 7' (or '(7)' for short) or 'persistence rule' as a way of invoking a very straightforward application of a referenced theorem. It is then clear to the reader what is going on and, to the prover, the problem is very simple: it looks for a simple proof. If no proof of at most five steps exists, the search fails. More importantly: the user should have foreseen it. The prover should never be used to sweep problems under the carpet.

Like with the type system of a programming language, it should be easy for the human reader to see that a given proof step is going to be accepted. It is by being predictable that automatic tools are useful, not by working magic.

*                         *
*

By way of conclusion, I go on to another aspect of Nipkow's talk. He said that applying formal proofs to the teaching of computer science is especially useful in those subjects where the formalization is close to people's "intuition". In the rest of the subjects, it is a bad idea. I say this is a drawback of his approach, not one of formal proofs. If you take formal proofs to be merely a proper input format for the mechanized treatment of intuitive arguments, it seems inevitable to run into that problem. However, formalism can be used for purposes which have nothing to do with mechanization: the purposes of expressing and understanding precise and general statements. If it is used in this capacity, formalism allows a (human) reasoner to take shortcuts which have no counterpart in intuitive arguments. It is one of the strengths of properly designed formalisms that you can use them to attain results which are beyond intuitive reasoning.

Case in point: the topics where Nipkow says formal proofs are not an appropriate vehicle for teaching are those where it would be crucial to rely on formalisms that are not merely the translation of intuitive reasoning. To use those, we would have to stop relying on our intuition and acquire the techniques of effective formal reasoning. This leads me back to my first point. 'Effective' means that we're not interested in finding just about any proof: we are looking for a simple and elegant one, so that problems whose solutions would otherwise be beyond our abilities can admit a simple solution.

In other words, striving for simplicity is not a purist's concern so much as a pragmatic preoccupation that allows us to solve, with as little effort as possible, problems that are beyond the unaided mind.

Simon Hudon
ETH Zurich
May 11th, 2011

[0] For the EWD texts, see http://www.cs.utexas.edu/users/EWD/

[1] For those who are used to using deductive systems for formal reasoning, the question "what alternative do we have to mutual implication?" might have come up. The answer is that equivalence is a special case of equality and should be treated as such. That is to say that its most straightforward use is by "substituting equals for equals", also known as "Leibniz's rule". It is also the most straightforward way to prove an equivalence. The naming of equivalence as "bidirectional implication" is a horrible deformation. It is analogous to calling equality of numbers "bidirectional inequality": it hints at a way of proving it using implication but does not distinguish between the shape of one possible proof and the theorem itself and, indeed, I realize that some people immediately think of mutual implication when they see an equivalence. It's a shame.

[2] In this respect, theorem provers seem to be more primitive than our modern programming languages. Whereas languages become more and more independent of their implementation in order to embody abstractions (for instance, in Java, there no longer is a "register" keyword for variables), the proof languages of automated provers shamelessly include a lot of specific commands for what I shall call "sweeping the rest of the problem under the rug". In proofs explicitly constructed for human readers, those would be replaced by vague expressions like "well you know..." followed, if presented in a talk, by a waving of the hands intended to say "anyone but idiots will get this".

[3] This goes against what seems like a school of thought that views formal proofs as the input of tools like theorem provers. It is easy for people of that school to draw the fallacious analogy with assembler programming. The important difference is that a proof designed with style can be the vehicle of one's understanding. Since mechanisms like abstraction are routinely applied to make proofs as simple as they can be, if one wants to understand an obscure and counter intuitive theorem, an elegant formal proof is very likely to be the best way to do it. On the other hand, assembler is not a language which helps one understand an algorithm. It is clearly an input format of a microprocessor and people should treat it as such.

Special thanks to my brother Francois for applying his surgical intellect to the dissection of the objections made against the notions of elegance and style, and to Olivier who also helped me improve this text.

Wednesday, March 23, 2011

Done in One Step

This is a short note to relate an anecdote that just happened to me. I'm learning Rutger M. Dijkstra's computation calculus by reading his paper of the same name and I came across a very short proof.

The objects of the calculus are computations, taken as predicates over state traces. The result looks like a relational calculus in that it never names the computations it is dealing with. It adds •, ; and ~ to the usual predicate calculus. • takes a state predicate `p' --a predicate over traces that is satisfied only by traces of one state-- and makes it a predicate satisfied by traces starting with a state satisfying `p'. It is captured by definition (13).

(13)	•p = p;true

where ; is the sequential composition of computations. `a;b' can be interpreted as constructing a computation satisfied by traces produced by executing `a' first, then `b'. For the proof, all we need is the state restriction property. It applies to a state predicate `p' and an arbitrary computation `s'.

(8)	p;s  =  p;true  ∧  s

Finally, ~ is a negation of state predicates that also yields a state predicate. That is to say that computations form a predicate algebra and so do state predicates, using ~ instead of ¬. And now, here is the proof. Don't spend too much time reading it; the second step is far from obvious.

	  [ •~p ⇒ ¬•p ]
	=   { Shunting; definition (13) }
	  [ p;true ∧ ~p;true  ⇒  false ]
	=   { state restriction (8) thrice }
	  [ (p ∧ ~p);true  ⇒  false ]
	=   { Predicate calculus }
	  [ false;true  ⇒  false ]
	=   { false is left zero of ; }
	  true

I decided to see for myself how the second step works, and my calculation took five steps instead of one. After struggling with it for a minute or so in my head, I decided to make appeals to associativity explicit in order to consider them as manipulation opportunities. One last thing: the weakest state predicate is 1; it plays the role of true in the related predicate algebra, and we can express that a predicate `p' is a state predicate by making it stronger than 1: [ p ⇒ 1 ]. In my calculation, ~ doesn't play a role, so I replace `~p' by `q'. Therefore, we are trying to transform p;true ∧ q;true into (p ∧ q);true using equality-preserving steps.

	  p;true ∧ q;true
	=   { (8) using p is a state predicate }
	  p;(q;true)
	=   { At this point, any other application 
	      of (8) would take us back a step or
	      would lead nowhere; let's use 
	      associativity }
	  (p;q);true
	=   { Now we can recognize the left hand 
	      side of (8) so we apply it using that p
	      is a state predicate again }
	  (p;true ∧ q);true
	=   { We can now take advantage of the
	      fact that q is a state predicate for the
	      first time with [ q ⇒ 1 ] or, rather
	      [ q  ≡  q ∧ 1 ] }
	  (p;true ∧ 1 ∧ q);true
	=   { Now p;true ∧ 1 matches the right
	      hand side of (8) and we can use it
	      to eliminate true and we can 
	      eliminate 1 because it is the identity
	      of ; }
	  (p ∧ q);true

There are mainly two things that helped in designing the above computation.

  1. We kept track of the properties that were left unused throughout the calculation. This especially helped us move forward.

  2. Realizing that some intermediate formulae would contain a series of consecutive ; and that we might have to manipulate different groupings led us to make the appeals to associativity explicit and made apparent the best way to use (8) for the second time.

Although I'm happy to use the opportunity to practice various techniques for designing formal proofs, the writing of the above was prompted by a (mild) dissatisfaction with Dijkstra's proof for leaving out most of the helpful and relevant bits.
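For readers who would like to experiment with the calculus, here is a minimal finite-trace model in Haskell --entirely my own simplification: finite, nonempty traces over a two-element state space, which is not the full calculus-- that can be used to test (8), (13) and the end result of the calculation above:

	type State = Bool
	type Trace = [State]          -- nonempty by convention
	type Comp  = Trace -> Bool    -- a computation is a predicate over traces

	-- All ways to split a trace into two pieces sharing the junction state.
	splits :: Trace -> [(Trace, Trace)]
	splits t = [ (take (i + 1) t, drop i t) | i <- [0 .. length t - 1] ]

	-- Sequential composition ; (written `cmp' here).
	cmp :: Comp -> Comp -> Comp
	cmp a b t = or [ a t1 && b t2 | (t1, t2) <- splits t ]

	-- A state predicate, seen as a computation holding of one-state traces only.
	statePred :: (State -> Bool) -> Comp
	statePred p [s] = p s
	statePred _ _   = False

	true' :: Comp
	true' = const True

	-- (13): •p  =  p ; true
	bullet :: (State -> Bool) -> Comp
	bullet p = statePred p `cmp` true'

	tracesUpTo :: Int -> [Trace]
	tracesUpTo n = concat [ sequence (replicate k [False, True]) | k <- [1 .. n] ]

	-- (8)  p;s = p;true ∧ s   and   p;true ∧ q;true = (p ∧ q);true
	check :: Bool
	check = and
	  [ law8 && result
	  | t <- tracesUpTo 3
	  , p <- preds, q <- preds, s <- comps
	  , let law8   = (statePred p `cmp` s) t == (bullet p t && s t)
	  , let result = (bullet p t && bullet q t)
	                   == (statePred (\x -> p x && q x) `cmp` true') t
	  ]
	  where
	    preds = [const False, const True, id, not]
	    comps = [true', statePred id, bullet not, bullet id `cmp` bullet not]

In this little model, check evaluates to True; it is of course no substitute for the calculational proofs, only a sandbox.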

Simon Hudon
March 23rd, 2011
ETH Zurich

Wednesday, March 9, 2011

Calculating with Partitions

In the book of Chandy & Misra [CM88], I found an interesting consequence of having three variables of which exactly one is true. To begin with, such a relation can be formalized as follows:

(0)	[ x ∨ y ∨ z ]
(1)	[ ¬(x ∧ y) ]
(2)	[ ¬(x ∧ z) ]
(3)	[ ¬(y ∧ z) ]

Note

Before proceeding with the interesting fact, let's have a closer look at the square brackets. They come from the book of Dijkstra & Scholten and they constitute a form of implicit universal quantification. If x, y and z are ordinary boolean variables, the brackets can be omitted. However, if they are predicates ([DS90] introduces them as boolean structures), you can leave out their arguments and they are considered to be universally quantified over. In such a case,

	[ x ∨ y ∨ z ]

would be equivalent to

	〈∀ i:: (x ∨ y ∨ z).i〉 

and predicate application is considered to distribute over the logical connectives so it is, in turn equivalent to

	〈∀ i:: x.i ∨ y.i ∨ z.i〉 

Since i doesn't play a role in our manipulations, leaving it out simplifies the formula and our work.

This means that, if x, y and z are the characteristic predicates of sets X, Y and Z, our constraints state that these sets partition the type over which they range.

(end of note)

Now the interesting consequence is this one:

(4)	[ ¬x  ≡  y ∨ z ]

Its proof is nice and simple. The usual heuristic for designing a proof of such a theorem is either to manipulate the whole formula or to transform one side into the other. Since we don't have rules to relate either of y or z to ¬x, we choose to manipulate the whole formula. Also, we need to join ¬x and y ∨ z with either a disjunction or a conjunction because of the shape of the rules that we have. The golden rule, stated below, is a good candidate to achieve just that.

(5)	[ x ∨ y ≡ x ≡ y ≡ x ∧ y ]	("Golden rule")

Those unfamiliar with the rule can convince themselves of its validity by seeing that x ⇒ y can be equivalently formulated in either of the following ways:

	x  ≡  x ∧ y
	x ∨ y  ≡  y

Both being equivalent to x ⇒ y, they are, by transitivity, equivalent to each other. Hence, the golden rule. Now for the proof of (4):

	  ¬x  ≡  y ∨ z
	=   { ¬ over ≡ to get ¬ out of the way }
	  ¬(x  ≡  y ∨ z)
	=   { Golden rule }
	  ¬(x ∧ (y ∨ z)  ≡  x ∨ y ∨ z)
	=   { (0) }
	  ¬(x ∧ (y ∨ z))
	=   { ∧ over ∨.  It achieves the
	      groupings of (1) and (2) }
	  ¬( (x ∧ y) ∨ (x ∧ z) )
	=   { ¬ over ∨ }
	  ¬(x ∧ y) ∧ ¬(x ∧ z)
	=   { (1) and (2) }
	  true
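Since everything here involves only three boolean variables, the claims can also be checked by brute force. Here is a small Haskell check of my own (the names are mine) for the golden rule (5) and for the fact that (0)-(3) imply (4):

	-- The golden rule (5), checked over all four assignments.
	goldenRule :: Bool
	goldenRule =
	  and [ (x || y) == (x == (y == (x && y)))
	      | x <- [False, True], y <- [False, True] ]

	-- (0)-(3) imply (4), checked over all eight assignments.
	partitionFact :: Bool
	partitionFact =
	  and [ not constraints || (not x == (y || z))
	      | x <- [False, True], y <- [False, True], z <- [False, True]
	      , let constraints = (x || y || z)
	                          && not (x && y) && not (x && z) && not (y && z) ]
	-- both evaluate to True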

Coming back to the proof above: it is six steps long and it needlessly carries an inactive negation around. We could remove one step and the noise of the negation if we simply realize that

	¬(x  ≡  y)  ≡  x  ≢  y

Also, we can use the golden rule over ≢ if we realize that replacing an even number of ≡ by ≢ in a sequence of equivalences leaves the truth value unchanged.

	  x  ≢  y ∨ z
	=   { Golden rule }
	  x ∧ (y ∨ z)  ≢  x ∨ y ∨ z
	=   { (0) }
	  ¬(x ∧ (y ∨ z))
	=   { ∧ over ∨ }
	  ¬((x ∧ y) ∨ (x ∧ z))
	=   { ¬ over ∨ }
	  ¬(x ∧ y) ∧ ¬(x ∧ z)
	=   { (1) and (2) }
	  true

It is nicer this way: the proof is shorter and ¬ is inactive for only one step. We can suppose that it can be generalized to more variables. Indeed, (0) to (3) can be generalized by (6) and (7), where we replace x, y, z by x.i for any appropriate i.

(6) [〈∃ i:: x.i〉 ]
(7) [〈∀ i,j: i ≠ j: ¬(x.i ∧ x.j)〉 ]

A first attempt at generalizing (4) might yield

(4')	[ x.j  ≢ 〈∃ i: i ≠ j: x.i〉 ]

but we can do better. We don't need to have only one variable on the left-hand side of ≢; we could have many, like on the right-hand side. The important thing is that each variable appears on exactly one side. To formalize this, we can use an arbitrary separating predicate P. The x.i such that P.i holds will be on the left and the others will be on the right.

(4'')	[〈∃ i: P.i: x.i〉 ≢〈∃ i: ¬P.i: x.i〉 ]

Notice how nicely symmetric (4'') is. It can be proved with the same technique as we used for (4).

	 〈∃ i: P.i: x.i〉 ≢〈∃ i: ¬P.i: x.i〉 
	=   { Golden rule }
	   〈∃ i: P.i: x.i〉 ∧〈∃ i: ¬P.i: x.i〉 
	  ≢〈∃ i: P.i: x.i〉 ∨〈∃ i: ¬P.i: x.i〉 
	=   { ∧ over ∃ and nesting; merge ranges }
	 〈∃ i,j: P.i ∧ ¬P.j: x.i ∧ x.j〉 ≢〈∃ i:: x.i〉 
	=   { (6) }
	  ¬〈∃ i,j: P.i ∧ ¬P.j: x.i ∧ x.j〉 
	=   { ¬ over ∃ }
	 〈∀ i,j: P.i ∧ ¬P.j: ¬(x.i ∧ x.j)〉 
	⇐  { P.i ∧ ¬P.j  ⇒  P.i ≠ P.j  ⇒  i ≠ j,
	      the last step by the contrapositive
	      of Leibniz's principle }
	 〈∀ i,j: i ≠ j: ¬(x.i ∧ x.j)〉 
	=   { (7) }
	  true

Our next investigation is whether or not (4'') is sufficient to characterize the kind of partition defined by (6) and (7). To do so, we prove that (4'') ≡ (6) ∧ (7). Mutual implication is an interesting choice of strategy here since half of the work is already done and we only have a weaker proof obligation to take care of. We can prove (6) and (7) separately under the assumption of (4'').

Proof of (6)

	 〈∃ i:: x.i〉 
	≠   { (4'') with P.i ≡ true }
	 〈∃ i: false: x.i〉 
	=   { Empty range }
	  false

For the proof of (7), we have to notice the formal differences between (7) and one of the sides of (4''). First of all, (7) has more dummies and a universal quantification instead of an existential one. Second, the term of the quantification has two (negated) occurrences of x rather than just one positive one. What we must do is split the dummies between two quantifications --nesting should do nicely--, then move the second x term out of the inner quantification and use one of the negations to change the ∀ into an ∃. The formula should then be amenable to an application of (4'').

Proof of (7)

	 〈∀ i,j: i ≠ j: ¬(x.i ∧ x.j)〉 
	=   { Nesting to separate the dummies }
	 〈∀ i::〈∀ j: i ≠ j: ¬(x.i ∧ x.j)〉 〉 
	=   { ¬ over ∧ then ∨ over ∀ to keep only one
	      x term inside the inner quantification }
	 〈∀ i::〈∀ j: i ≠ j: ¬x.j〉  ∨ ¬x.i〉  
	=   { ¬ over ∃ }
	 〈∀ i:: ¬〈∃ j: i ≠ j: x.j〉  ∨ ¬x.i〉  
	=   { Now we have the shape we were looking
	      for; use (4'') with P.j  ≡  (i = j) }
	 〈∀ i::〈∃ j: i = j: x.j〉  ∨ ¬x.i〉  
	=   { One point rule }
	 〈∀ i:: x.i ∨ ¬x.i〉  
	=   { Excluded middle and identity of ∀ }
	  true

Luckily enough, after the application of (4''), the steps are of the kind "there is only one thing to do". Without context, the rest of the proof could look like a giant rabbit pulled out of a hat but, since (4'') is the main rule at our disposal, it is reasonable to expect that we should create an opportunity to apply it, and each step before its application bridges the formal differences between the two formulae.

The above proof has been presented as a nice example of how an elegant proof can be designed using only the properties of the formulae rather than their interpretation, and to present some more advanced heuristics for doing so.

The investigation was prompted by my delight at seeing how nicely the first parts of the problem were solved using the golden rule, but the decision to show the proof was taken because of my surprise at how easily the last proof came to me through a simple analysis of the formal differences between the two main formulae, although I had expected it to keep me busy for a long time.
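As a final sanity check --my own, not part of [CM88]--, the characterization can be confirmed by brute force for a small index set: requiring (4'') for every separating predicate P is the same as requiring that exactly one x.i holds, i.e. (6) ∧ (7).

	exactlyOne :: [Bool] -> Bool
	exactlyOne xs = length (filter id xs) == 1

	-- (4'') quantified over every separating predicate P, with P and x both
	-- represented as lists of booleans indexed by position.
	holdsForAllP :: [Bool] -> Bool
	holdsForAllP xs =
	  and [ or [ p && x | (p, x) <- zip ps xs ]
	          /= or [ not p && x | (p, x) <- zip ps xs ]
	      | ps <- allAssignments (length xs) ]
	  where allAssignments n = sequence (replicate n [False, True])

	-- (∀ P: (4''))  ≡  (6) ∧ (7), checked for three variables.
	characterization :: Bool
	characterization =
	  and [ holdsForAllP xs == exactlyOne xs
	      | xs <- sequence (replicate 3 [False, True]) ]
	-- evaluates to True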

Simon Hudon
March 8th, 2011
Meilen

References

[CM88] K. Mani Chandy and Jayadev Misra, Parallel Program Design:
       A Foundation, Addison-Wesley, 1988

[DS90] Edsger W. Dijkstra and Carel S. Scholten, Predicate Calculus
       and Program Semantics, Texts and Monographs in Computer
       Science, Springer-Verlag New York Inc., 1990