Monday, July 7, 2014

Where have I been?

It has been a while since my last blog post. I have a collection of posts that I have been meaning to publish but that were somehow never good enough. I'm going to try harder to publish them. In the meantime, here is a snapshot of what has been going on with me in the last two years... at least scientifically.

I have undergone a peculiar transformation since I moved from Zürich to Toronto. I used to think that you can study a software system purely from a logical point of view and that, once you have proved all you needed to prove, all that is left to do is run the software. I have done it for some non-trivial software systems but I realize that, for a bigger piece of software, the amount of work to cover the whole thing would be excessive both in the sense of "exceeding necessity" and "exceeding ability to do so". I'm reworking my vision of software design as an activity that includes proving theorems but which also comprises other techniques of validation. For example, random testing as done by QuickCheck in Haskell is often a good compromise. The key is to choose which technique applies best.

For instance, I started developing a program one year ago and it has now become relatively big (considering that I'm the only person developing it) and, to my surprise, it works well even though I didn't prove a single property about it. In the back of my mind however, I keep pulling some parts of the design aside and figuring out the logic that would make it possible to prove that it is correct. One of these days, I'll prove the correctness of some crucial parts of my programs and I think that will be both really interesting and really useful -- especially considering that the program is a formal verifier.

On the other hand, I am also discovering the interplay between mastering the theory and having good software tools for designing software. I used to think that if you know the theory well, you have no need for tools. I'm realizing that this is a pretty silly point of view: the program that I mentioned above is specialized in writing proofs for simple conjectures. It is incredibly fast. In one of the small systems that I use that tool for, there are around 500 conjectures that need to be proved. Most of them are incredibly easy to prove but they still need to be proved. My tool takes about 30s to prove 499 of them, leaving me with the one truly difficult theorem. That's incredibly helpful because now I can focus my attention on the hardest of the hard design problems. The best part is that, compared to an interactive prover, I do not need to manually invoke the prover on each proof obligation. They will all be attempted automatically except for those with a proof provided by the user.

The tool is designed to allow me to verify Unit-B models. I don't think I would be able to use Unit-B as confidently and as efficiently as I do without it. Even better, when using Unit-B to develop the very first examples, every time I was faced with a problem, I could improvise a solution, think really hard and figure out that it was valid. This is very inconvenient for teaching the method because I hardly have a list of all the tricks that I use. By implementing a verifier, I need to identify the truly useful techniques, make sure that they can be used with confidence and have a clear explanation for disallowing whatever I choose to exclude. In other words, I'm now forced to formulate the rules of the method independently of any one example.

So much for simplistic views of the world.

Tuesday, February 28, 2012

My Master Thesis

For a long time, I have wanted to talk about discrete transition systems as the main abstraction of program execution. People use such abstractions for two reasons: producing theoretical results and reasoning about their programs. Although I don't have problems with the first one (I don't quarrel with people looking for theoretical results, I think it's a valid goal) I am looking to reason about my programs to document them with a proof of their correctness.

I always had a hard time choosing which part to start with and I think I now have something nicely simple to say about Event-B, UNITY and, my own creation, Unit-B. It focuses on the achievements of my master thesis but it should already make sense for anyone who is not my supervisor and who does not have any experience with any of the mentioned formal methods.

The goal of my master thesis was to design a formal method for developing correct parallel programs. The result is Unit-B, a formal method inspired by Event-B and UNITY.

Using Unit-B, one can design a concurrent program by refining, step by step, a specification into a program in a similar fashion as in Event-B. However, while refinement in Event-B only preserves safety properties, refinement in Unit-B also preserves liveness properties. This means that, once it is established that a specification S satisfies a progress property P, all specifications and programs refining S will also satisfy P. It ensues that, as opposed to Event-B, it is possible in Unit-B to introduce progress properties in a specification along the refinement process exactly at the time where they make most sense rather than at the very end of a development.

The technique has two benefits. First, in Event-B, when one tries to establish a liveness property, it is often too late: while trying to make the system safe, one has removed any possibility for the system to perform a useful function. Such is not the case with Unit-B. By tackling progress properties before the related safety properties, we eliminate the possibility of addressing safety concerns by being too conservative.

Second, by putting the requirement for progress as the center of attention in the process of refinement, the specification ends up shaping itself around the progress property. In other words, the concern for progress dictates the skeleton of a valid program. After that, the concern for safety can be seen as putting the flesh on the bones.

Compared to UNITY, this approach has the benefit of making the transition from specification to program smoother. Since a program is just a special kind of specification in Unit-B, the suitable application of refinement to specification will slowly give birth to a program. In UNITY, a specification and a program are two completely different things making the refinement of a specification and the creation of a program two distinct activities.

From UNITY, Unit-B takes the temporal logic and the fairness assumption in the execution of its transition system. Fairness facilitates the mapping from progress properties to transition systems and the temporal logic is tailored to express, simply, properties that can be mapped to programs.

The semantics of Unit-B is defined using R.M. Dijkstra's computation calculus, which allowed me to be very fine-grained in the choice of, and in the proof of correctness of, the refinement rules.

Cheers,
Simon Hudon
York U
Toronto, Canada
February 28th 2012

Tuesday, July 5, 2011

On Function Purity and Simple Framing

Here is the latest blog post by Bertrand Meyer.

http://bertrandmeyer.com/2011/07/04/if-im-not-pure-at-least-my-functions-are/

It talks about the problem of function purity and broaches the subject of framing. In short, the frame of the specification of a routine is the set of variables (in the broad meaning of the term) that the execution of the routine can modify. In object oriented programming, it is especially tricky because of the sets of dynamically allocated objects of which it might be necessary to alter the state in a given routine. Indeed, such a set can be taken to be the set of objects reachable from a given root. In a limited solution, such as shown by Meyer in his latest post, you only name variables and expressions that can change. In more elaborate solutions, such as the dynamic frames of Kassios (as quoted by Meyer), it is possible to provide a set of objects and the attributes that can be expected to change. It is especially handy for specifying dynamic data structures like linked lists.

Such a solution seems to present the disadvantage of disallowing the extension of frames through inheritance, which Meyer preserves. I think a compromise can be reached between the two but it is a bit too involved for a post only intended for sharing another blog post.
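To make the notion of a frame a little more concrete, here is a minimal Python sketch of the "limited solution" where a frame only names the attributes that may change. Everything in it (the Account class, modified_attributes, respects_frame) is a hypothetical illustration of mine, not Meyer's notation nor Kassios's dynamic frames, and it only observes a single execution at run time instead of proving anything.

```python
import copy

_MISSING = object()  # marks an attribute that is absent before or after the call


class Account:
    """Hypothetical class used only to illustrate frames."""

    def __init__(self, balance, owner):
        self.balance = balance
        self.owner = owner

    def deposit(self, amount):
        # Intended frame: only `balance` may change.
        self.balance += amount


def modified_attributes(obj, routine, *args):
    """Run routine(obj, *args) and return the names of the attributes that changed."""
    before = copy.deepcopy(vars(obj))
    routine(obj, *args)
    after = vars(obj)
    names = before.keys() | after.keys()
    return {n for n in names
            if before.get(n, _MISSING) != after.get(n, _MISSING)}


def respects_frame(obj, routine, frame, *args):
    """True when this one execution modified nothing outside the declared frame."""
    return modified_attributes(obj, routine, *args) <= set(frame)
```

For instance, respects_frame(Account(10, "ann"), Account.deposit, {"balance"}, 5) holds, whereas the same call with an empty frame does not.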

Tuesday, June 21, 2011

Why I Prefer Predicates to Sets

Various recent comments by others have made me realize that it is about time I write a comparison between sets and predicates.

One comment that springs to mind is a response I got when I said that I almost never use sets, that I prefer predicates. I was asked what the point of choosing was since predicates can be represented as sets: a predicate can be seen as a boolean function which in turn can be seen as a functional relation which is basically a set of tuples. More directly, with every predicate you can associate a set which contains exactly the objects which satisfy the predicate. To which I have to add that, for each set, there exists a characteristic predicate which is satisfied exactly by the objects contained in the set.

What we just established is the semantic equivalence between the two. Such equivalence is less important than it might seem. From a logician's point of view, the equivalence tells us that for each theorem of predicate logic, there should be an equivalent theorem in set theory.
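The correspondence in both directions can be sketched in a few lines of Python. This is only an illustration over a small finite universe; UNIVERSE, set_of and predicate_of are names I am assuming for the occasion.

```python
UNIVERSE = range(10)  # assumed finite universe, so that sets can be enumerated


def set_of(p):
    """The set containing exactly the objects that satisfy predicate p."""
    return {x for x in UNIVERSE if p(x)}


def predicate_of(S):
    """The characteristic predicate of set S."""
    return lambda x: x in S


def is_even(x):
    return x % 2 == 0


EVENS = set_of(is_even)
```

Going from a set to its predicate and back yields the same set, and going from a predicate to its set and back yields a predicate that agrees with the original on every element of the universe.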

From an engineering and scientific point of view, that is from the point of view of someone who wants to use either of these tools, there is an important difference, one of methodological importance, which, understandably, is irrelevant for logicians.

The difference lies in the flexibility of the related notation. It is hence of syntactic nature. If we want to let the symbols do the work --and I do--, we have to make sure that we can find solutions in the simplest possible way.

One way of doing this is to avoid spurious distinctions, i.e. to preserve symmetry as much as we can. This is where predicate calculus [0] is superior to set theory. If we start by encoding set theory into predicate calculus, we will see, little by little, a fundamental difference emerge. In the following, capital letters will stand for sets and their corresponding predicates will be the same letter in lower case. As an example:

	(0)  x ∈ P  ≡  p.x
		for all x

with the dot standing for function application and ≡ for logical equivalence [1]. It relates a set to its characteristic predicate.

Note:
Please notice that the convention of upper cases and lower cases is nicely symmetric. We could have defined characteristic predicates of a set as cp.S or the set of a predicate as set.p but we chose not to in order to preserve the symmetry between the two. Instead of saying that one is the characteristic predicate of the other, we say that they correspond to each other and it does not have to appear everywhere in the formulae where they appear.
(end note)

Notation

Here is a little bit more of syntax before I start the comparison. I take it mostly from [DS90].

If we want to say that a predicate p holds exactly when q holds we can express it as follows:

		〈∀ x:: p.x ≡ q.x〉

It is a bit too verbose for something that is supposed to relate p and q very simply and, for that reason, we introduce a shorthand. Instead of quantifying over x and using x everywhere in the formula, we will have an operator that will say: this predicate holds everywhere:

	 	[p ≡ q]

It is more concise and it is an improvement. It might seem like cheating though unless we know of the notion of pointwise extension of operators. For instance, ≡ applies to boolean operands but we allow ourselves to apply it to compare boolean valued functions too. This is how it then behaves:

	(1)  (p ≡ q).x  ≡  p.x  ≡  q.x
		for all x

We implicitly postulate that all logical connectors apply to boolean-valued functions. If we add one more definition, the previous universal quantification can be calculated to a form with an everywhere operator.

	(2)  [p]  ≡ 〈∀ x:: p.x〉

		 〈∀ x:: p.x ≡ q.x〉
		=   { pointwise extension }
		 〈∀ x:: (p ≡ q).x〉 
		=   { (2) with p := (p ≡ q) }
		  [p ≡ q]
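For readers who prefer to see it executed, here is a small Python sketch of pointwise extension (1) and of the everywhere operator (2). It is restricted to a finite universe so that the quantification can be computed; the names UNIVERSE, equiv and everywhere are my own assumptions.

```python
UNIVERSE = range(10)  # assumed finite universe standing in for "all x"


def equiv(p, q):
    """Pointwise extension of ≡ to predicates, as in (1)."""
    return lambda x: p(x) == q(x)


def everywhere(p):
    """The everywhere operator [p], as in (2)."""
    return all(p(x) for x in UNIVERSE)


def p(x):
    return x % 2 == 0


def q(x):
    return (x * x) % 2 == 0
```

With these definitions, everywhere(equiv(p, q)) computes the same boolean as all(p(x) == q(x) for x in UNIVERSE), which is the content of the calculation above.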

Translating Sets

We can start our comparison by translating sets into predicate formulations. We will concentrate on set equality, subset, intersection, union, subtraction and symmetric difference.

	(3)  P = Q  ≡  [p ≡ q]
	(4)  P ⊆ Q  ≡  [p ⇒ q]
	(5)  x ∈ P ∩ Q  ≡  (p ∧ q).x
	(6)  x ∈ P ∪ Q  ≡  (p ∨ q).x
	(7)  x ∈ P \ Q  ≡  (p ∧ ¬q).x
	(8)  x ∈ P △ Q  ≡  (p ≢ q).x

with △ being the symmetric difference, the set of elements that are present in exactly one of the two sets, and ≢ being the discrepancy, the negation of ≡. We see that the first operators are translated into a form with one logical connector and the everywhere operator. Also, no operator is built uniquely around either of the implication or the equivalence. Therefore, there is no unique translation of expressions like the following:

	[p ≡ q ≡ r ≡ s]

To allow us to manipulate the above, we introduce a definition for set equality. It can be calculated from our previous postulates.

	(9)  P = Q  ≡ 〈∀ x:: x ∈ P  ≡  x ∈ Q〉

		  [p ≡ q ≡ r ≡ s]
		=   { double negation }
		  [¬¬(p ≡ q ≡ r ≡ s)]
		=   { ¬ over ≡ twice, see (10) below }
		  [¬(p ≡ q) ≡ ¬(r ≡ s)]
		=   { (11) }
		  [p ≢ q ≡ r ≢ s]
		=   { everywhere expansion (2) }
		 〈∀ x:: (p ≢ q ≡ r ≢ s).x〉
		=   { pointwise extension of ≡ }
		 〈∀ x:: (p ≢ q).x ≡ (r ≢ s).x〉
		=   { (8) twice }
		 〈∀ x:: x ∈ P △ Q ≡ x ∈ R △ S〉
		=   { (9) }
		  P △ Q  =  R △ S

	(10)  [¬(p ≡ q) ≡ ¬p ≡ q]
	(11)  [¬(p ≡ q) ≡ p ≢ q]

Of course, for someone used to it, this is a simple conversion and there would be no need to be this explicit but I include it nonetheless for the sake of those who are not used to it.

Our formula is therefore proven to be equivalent to the one below.

	P △ Q  =  R △ S

we could also introduce the negations elsewhere and get:

	P  =  Q △ R △ S

Since equivalence is associative -- that is: ( (p ≡ q) ≡ r ) ≡ ( p ≡ (q ≡ r) )-- and symmetric --that is: (p ≡ q) ≡ (q ≡ p)--, there are a whole lot of other formulae from set theory that are equivalent to the predicate calculus formula we showed. All those results create many theorems which basically "mean" the same thing but that one may have to remember anyway. As a comparison, our one predicate can be parsed in various ways and we can use it to substitute p ≡ q for r ≡ s or r for p ≡ q ≡ s, to name just two usages of a simple formula.
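As a sanity check of the two set-theoretic translations given above, the following Python sketch enumerates every choice of P, Q, R and S over a three-element universe and compares them with [p ≡ q ≡ r ≡ s]. The universe and the function names are assumptions of mine; an exhaustive check on one small universe is an illustration, not a substitute for the calculation.

```python
from itertools import combinations, product

UNIVERSE = (0, 1, 2)  # assumed small universe: 8 subsets, 8**4 combinations


def subsets(universe):
    """All subsets of the universe, as frozensets."""
    return [frozenset(c)
            for r in range(len(universe) + 1)
            for c in combinations(universe, r)]


def chain_equiv_everywhere(P, Q, R, S):
    """[p ≡ q ≡ r ≡ s], where p.x stands for `x in P`, and so on."""
    return all((((x in P) == (x in Q)) == (x in R)) == (x in S)
               for x in UNIVERSE)


def check_all():
    """Compare the predicate formula with both set formulations (^ is △)."""
    for P, Q, R, S in product(subsets(UNIVERSE), repeat=4):
        chain = chain_equiv_everywhere(P, Q, R, S)
        if chain != ((P ^ Q) == (R ^ S)):
            return False
        if chain != (P == (Q ^ R ^ S)):
            return False
    return True
```

Calling check_all() returns True. Note that the predicate side is a single formula, while each set-theoretic reading has to be written down and checked separately.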

One could suggest that the problem lies with the usual reluctance of logicians to talk about set complement, that if we had an operator, say ⊙, to stand for the complement of △ --which can be seen as problematic because it creates a set with elements taken from neither of its operands--, we wouldn't need to introduce spurious negations. The above formula could then be translated into:

	P ⊙ Q  =  R ⊙ S 

It remains to choose which equivalence becomes an equality and which ones become the complement of a symmetric difference. I believe the biggest problem to be that set equality mangles two orthogonal notions together: boolean equality and (implicit) universal quantification. The same goes for implication.

In short, this is one example where set theory destroys a beautiful symmetry and introduces spurious distinctions that pollute one's reasoning rather than help it. Whenever I have problems to solve and calculations can help me, in the vast majority of cases, I use predicate calculus rather than set theory.

Simon Hudon
April 10th, 2011
Meilen

Notes

[0] I use the calculus, not the logic. The difference is that the logic deduces theorems from other theorems whereas the calculus simply helps us create relations between formulae and find new formulae from old ones whether they are theorems or not.

[1] Like many others, I have forsaken the double arrow for that purpose. The choice can be traced back to [DS90] and [AvG90].

References

[AvG90] Antonetta J. M. van Gasteren. 1990. On the Shape of
       Mathematical Arguments. Springer-Verlag
       New York, Inc., New York, NY, USA.

[DS90] Edsger W. Dijkstra and Carel S. Scholten. 1990. Predicate
       Calculus and Program Semantics. Springer-Verlag New
       York, Inc., New York, NY, USA.

Tuesday, June 14, 2011

"Lifting weights with your brain"

I am posting the following just for the sake of sharing a quote I liked in an essay I'm reading:

In workouts a football player may bench press 300 pounds, even though he may never have to exert anything like that much force in the course of a game. Likewise, if your professors try to make you learn stuff that's more advanced than you'll need in a job, it may not just be because they're academics, detached from the real world. They may be trying to make you lift weights with your brain.

By Paul Graham, Undergraduation. I don't suppose it will completely eradicate the comment "What good does it do me to learn [put a beautiful piece of theory, technology, etc here]? People don't use it." Another reply of course might be to point out that "maybe they should".

Simon Hudon
June 14th, 2011
Meilen

Monday, June 6, 2011

Decidability

The proof of the undecidability of the halting problem has always fascinated me. It is a formidable result to behold, at least the first time around. It relies on the construction of a program that cannot exist because of a contradiction drawn through self-reference. Here is an interesting rendering of the argument by Christopher Strachey: An Impossible Program.

However, an interesting paper written by Eric Hehner (Problems with the Halting Problem) casts doubt on the correctness of the usual proof. He proposes that the contradiction in the proof is due to self-reference rather than to the structure of the halting problem. It is a good read and I strongly recommend it.

Simon Hudon
Meilen
June 6th 2011

Monday, May 9, 2011

A Lecture Supported by an Automated Theorem Prover

UPDATED: added note [1] on mutual implication

Last week on Monday, I attended the guest lecture that Prof. Tobias Nipkow from the Technical University of Munich (TUM) gave at ETH. I had heard of him before because of his work on the Isabelle theorem prover. An interesting thing before I start: he refers to it as a proof assistant.

I didn't expect much of this talk because of my reluctance to rely on automatic provers to do my job and because I expected a very technical talk about proof strategies implemented or to be implemented in Isabelle. Since such issues need concern only people involved in the construction of theorem provers and I'm not one of them, I didn't think it would appeal to me.

I was pleasantly surprised that he had chosen to talk about a course he had recently started to give on the topic of semantics. Said course was noteworthy because formal methods and an automated prover were used as teaching vehicles. I am pleased to see that he decided to use formal methods as a tool for understanding the topic and that he would opt for spending more time on semantics, the subject matter, and less on logic and formal methods.

It is my considered opinion, however, that a prerequisite course on the design of formal proofs would be most useful, even necessary. I draw this conclusion because I believe that acquiring effective techniques for designing elegant proofs can help tremendously in making hard problems easier. But, since he didn't want to teach logic, he gave very little attention to the design of proofs and adopted "validity of the proofs" as the goal the students had to reach in their assignments. It was asked --by someone else before I could ask it-- what importance he gave to the style of the submitted proofs. He said none because some (in his opinion) very good students had an awful style and he didn't want to penalize them for it. He considered that, having submitted a correct proof, the student had shown that he understood. I would say that, in doing so, he was doing his students a disservice. However, I can now better understand his position because the underlying assumption is very popular: since words like style and elegance have a very strong esthetic connotation, it becomes automatically whimsical to judge them or to encourage the students to improve upon them. After all, we're here to do science, not to make dresses!

*                         *
*

Even when it has been successfully argued just how useful mathematical elegance can be, people keep opposing its use as a yardstick on the ground that it is too subjective and enforcing one standard would be arbitrary.

It turns out that, not only are elegance and style very useful, but they can be very objectively analyzed and criticized. It has been the preferred subject of Dijkstra in the second part of his career and it appears almost everywhere in his writing --see EWD619 Essays on the nature and role of mathematical elegance for such an exposition [0]--.

The usefulness of an elegant style comes from the fact that it yields simple and clear proofs requiring very little mental work from the reader (which is not something to be underestimated) but, more importantly, it allows the writer of a proof to be economical in the use of his reasoning abilities. This can make the difference between a problem which is impossible to solve and one which is easily solved. On the other hand, in such a course as what Nipkow has designed, the most interesting things for students to learn are not the individual theorems that they proved but the techniques that they used to find a proof. If the techniques are not taught, it can be expected that the skills the students acquire will be of much less use when confronted with different problems.

What an elegant style boils down to is, to put it in Dijkstra's words, the avoidance of complexity generators and the separation of one's concerns. He also argued that concision is an effective yardstick to evaluate the application of those techniques. Indeed, concision is effectively achieved by both separating one's concerns and avoiding complexity generators. The most well known complexity generators are case analyses, proofs of logical equivalence by mutual implication and inappropriate naming. The first two are legitimate proof techniques but their application doubles the burden of the proof. It doesn't mean that they should never be used but rather that they should be avoided unless the proofs of the many required lemmata differ significantly. [1] I say significantly to stress that, in some cases, they can differ in small ways and, upon closer scrutiny, the differences can be abstracted from.

The issue of naming is also closely related to the separation of one's concerns but, being a tricky issue, I would rather point the reader in the direction of van Gasteren's book "On the Shape of Mathematical Arguments", to chapter 15 "On Naming", which covers the subject very nicely. Since Dijkstra was van Gasteren's supervisor, he wrote the chapter with her and it has earned it an EWD number: 958 [0]. This allows me to skip directly to the matter of separation of concerns.

*                         *
*

While debating about concision and about Nipkow's lecture with my supervisor, something kept coming up in his arguments. I was favoring short formal proofs and he kept asking: "What if a student doesn't want to call 'auto' [the function of Isabelle which can take care of the details of a proof step] but wants to go into the details to understand them?" First of all, I have to point out that being the input for an automated tool doesn't relieve a proof from the obligation of being clear. [2] Unlike Nipkow, I hold that the proof must include intermediate formulae accompanied with a hint of how one proceeds to find them. This would correspond to the combination of what he calls a proof script --a series of hints without intermediate formulae-- and a structured proof --a series of intermediate formulae with little to no hints; this is what he prefers to use--. In that respect, 'auto' is no better than the hint "trivially, one sees ...". The choice of how early one uses 'auto' is basically related to decomposition. It is unrelated to the peculiarities of the prover but relies on how clear (to a human reader) and concise the proof is without the details. What my supervisor and Nipkow name "using auto later in the proof" would be, in a language independent of the use of an automated prover, including more details. If the proof is clear and concise without those details, they don't belong in the body of the proof. It is that simple. One could, however, include them in the proof of a lemma invoked in one (or many) proof steps of the main proof.

By including a lemma, one doesn't destroy concision because the proof of said lemma can be included beside that of the main theorem rather than intertwined with it. The difference is that, while one reads a proof, each step must clearly take him closer to the goal. Any digression should be postponed so as not to distract the attention from the goal. One good indication that a lemma is needed is when many steps of a proof are concerned with a different subject than the goal. For instance, when proving a theorem in parsing theory, if many successive steps are concerned with predicate calculus, the attention is taken away from parsing. Instead, making the whole predicate calculation in one step and labeling it "predicate calculus" is very judicious. Nothing prevents the proof that pertains to predicate calculus from being presented later on, especially if it is not an easy proof.

The important point here is that, sticking to the subject at hand doesn't mean forgetting that there are other problems that need one's attention. It means dealing with one problem at a time each time forgetting momentarily what the other problems are. This is exactly what modularity is about.

Furthermore, with an automatic prover, nothing prevents someone from using 'auto' to commit the sin of omission; details that would make a step clear are then missing. It is then a matter of style to judge how much should be added. This reinforces my point that good style should be taught because clarity is the primary goal with proofs [3].

With respect to such a tool, I would welcome the sight of one where keywords like 'auto' are replaced by catchwords like 'predicate calculus' to hint at the existence of a simple proof --at most five steps-- in predicate calculus --in this case-- that supports the designated step. More often, we could use invocations like 'theorem 7' (or '(7)' for short) or 'persistence rule' as a way of invoking a very straightforward application of a referenced theorem. It is clear to the reader what is going on then and to the prover, the problem is very simple: it looks for a simple proof. If no proofs of at most five steps exist, the search fails. More importantly: the user should have foreseen it. The prover should never be used to sweep problems under the carpet.

Like with the type system of a programming language, it should be easy for the human reader to see that a given proof step is going to be accepted. It is by being predictable that automatic tools are useful, not by working magic.

*                         *
*

By way of conclusion, I go on to another aspect of Nipkow's talk. He said that applying formal proofs to the teaching of computer science is especially useful in those subjects where the formalization is close to people's "intuition". In the rest of the subjects, it is a bad idea. I say this is a drawback of his approach, not one of formal proofs. If you take formal proofs as a proper input format for the mechanized treatment of intuitive arguments, it seems inevitable to run into that problem. However, formalism can be used for purposes which have nothing to do with mechanization: the purposes of expressing and understanding precise and general statements. If it is used in this capacity, formalism allows a (human) reasoner to take shortcuts which have no counterpart in intuitive arguments. It is one of the strengths of properly designed formalisms that you can use them to attain results which are beyond intuitive reasoning.

Case in point, the topics where Nipkow says formal proofs are not an appropriate vehicle for teaching are those where it would be crucial to rely on formalisms that are not merely the translation of intuitive reasoning. To use those, we would have to stop relying on our intuition and acquire the techniques of effective formal reasoning. This leads me back to my first point. 'Effective' means that we're not interested in finding just about any proof: we are looking for a simple and elegant one so that problems for which the solution would be beyond our abilities could admit a simple solution.

In other words, striving for simplicity is not a purist's concern as much as a pragmatic preoccupation that allows us to solve, with as little effort as possible, problems that are beyond the unaided mind.

Simon Hudon
ETH Zurich
May 11th 2011

[0] For the EWD texts, see http://www.cs.utexas.edu/users/EWD/

[1] For those who are used to using deductive systems for formal reasoning, the question "what alternative do we have to mutual implication?" might have come up. The answer is that equivalence is a special case of equality and should be treated as such. That is to say that its most straightforward use is by "substituting equals for equals", also known as "Leibniz's rule". It is also the most straightforward way to prove an equivalence. The naming of equivalence by "bidirectional implication" is a horrible deformation. It is analogous to calling equality in numbers "bidirectional inequality": it hints at a way of proving it using implication but does not distinguish between the shape of one possible proof and the theorem and, indeed, I realize that some people immediately think of mutual implication when they see an equivalence. It's a shame.

[2] In this respect, theorem provers seem to be more primitive than our modern programming languages. Whereas the latter become more and more independent of their implementation to embody abstractions (for instance, in Java, there no longer is a "register" keyword for variables), the proof languages of automated provers shamelessly include a lot of specific commands for what I shall call "sweeping the rest of the problem under the rug". In proofs explicitly constructed for human readers, those would be replaced by vague expressions like "well you know..." followed, if presented in a talk, by a waving of the hands intended to say "anyone but idiots will get this".

[3] This goes against what seems like a school of thought that views formal proofs as the input of tools like theorem provers. It is easy for people of that school to draw the fallacious analogy with assembler programming. The important difference is that a proof designed with style can be the vehicle of one's understanding. Since mechanisms like abstraction are routinely applied to make proofs as simple as they can be, if one wants to understand an obscure and counter intuitive theorem, an elegant formal proof is very likely to be the best way to do it. On the other hand, assembler is not a language which helps one understand an algorithm. It is clearly an input format of a microprocessor and people should treat it as such.

Special thanks to my brother Francois for applying his surgical intellect to the dissection of the objections made against the notions of elegance and style and to Olivier who also helped me improve this text.

Wednesday, March 23, 2011

Done in One Step

This is a short note to relate an anecdote that just happened to me. I'm learning Rutger M. Dijkstra's computation calculus by reading his paper of the same name and I came across a very short proof.

The objects of the calculus are computations taken as predicates over state traces. The result looks like a relational calculus in that it never names the computations it is dealing with. It adds • ; ~ to the usual predicate calculus. • takes a state predicate `p', that is, a predicate over traces that is satisfied only by traces of one state, and makes it a predicate satisfied by traces starting with a state satisfying `p'. It is captured by definition (13).

(13)	•p = p;true

where ; is the sequential composition of computations. `a;b' can be interpreted as constructing a computation satisfied by traces produced by executing `a' first then `b'. For the proof, all we need is the state restriction property. It applies to state predicate `p' and arbitrary computation `s'.

(8)	p;s  =  p;true  ∧  s

Finally, ~ is a negation of a state predicate that also yields a state predicate. That is to say that computations form a predicate algebra and so do state predicates, using ~ instead of ¬. And now, here is the proof. Don't spend too much time reading it; the second step is far from obvious.

	  [ •~p ⇒ ¬•p ]
	=   { Shunting, definition (13) (see below) }
	  [ p;true ∧ ~p;true  ⇒  false ]
	=   { state restriction (8) thrice }
	  [ (p ∧ ~p);true  ⇒  false ]
	=   { Predicate calculus }
	  [ false;true  ⇒  false ]
	=   { false is left zero of ; }
	  true

I decided to see for myself how the second step works and my calculation took five steps instead of one. After struggling with it for a minute or so in my head, I decided to make appeals to associativity explicit in order to consider it as a manipulation opportunity. One last thing: the weakest state predicate is 1; it plays the role of true in the related predicate algebra and we can express that any predicate `p' is a state predicate by making it stronger than 1: [ p ⇒ 1 ]. In my calculation, ~ doesn't play a role so I replace `~p' by `q'. Therefore, we are trying to transform p;true ∧ q;true into (p ∧ q);true using equality preserving steps.

	  p;true ∧ q;true
	=   { (8) using p is a state predicate }
	  p;(q;true)
	=   { At this point, all other application 
	      of (8) would take us back a step or
	      would lead nowhere, let's use 
	      associativity }
	  (p;q);true
	=   { Now we can recognize the left hand 
	      side of (8) so we apply it using that p
	      is a state predicate again }
	  (p;true ∧ q);true
	=   { We can now take advantage of the
	      fact that q is a state predicate for the
	      first time with [ q ⇒ 1 ] or, rather
	      [ q  ≡  q ∧ 1 ] }
	  (p;true ∧ 1 ∧ q);true
	=   { Now p;true ∧ 1 matches the right
	      hand side of (8) and we can use it
	      to eliminate true and we can 
	      eliminate 1 because it is the identity
	      of ; }
	  (p ∧ q);true

There are mostly two things that helped design the above computation.

  1. We kept track of the properties that were left unused through the calculation. It especially helped to move forward.

  2. Realizing that some intermediate formulae would contain a series of consecutive ; and that we might get to manipulate different groupings led us to make the appeals to associativity explicit and made apparent the best way to use (8) for the second time.

Although I'm happy to use the opportunity to practice various techniques for designing formal proofs, the writing of the above was prompted by a (mild) dissatisfaction at Dijkstra's proof for leaving out most of the helpful and relevant bits.
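
The five steps can also be checked mechanically on a toy model. The Python sketch below is not taken from Dijkstra's paper and rests on assumptions of mine: a computation is a set of nonempty finite traces of bounded length over three states, a state predicate is a set of one-state traces, and ; glues two traces on a shared state. Infinite traces are absent, so this exercises only a fragment of the calculus. The names `seq`, `state_pred`, `TRUE` and `ONE` are hypothetical.

```python
from itertools import combinations, product

STATES = (0, 1, 2)
MAX_LEN = 3  # assumption: only finite traces, of bounded length

# `true` is every trace of the model; `1` is every one-state trace.
TRUE = frozenset(t for n in range(1, MAX_LEN + 1)
                 for t in product(STATES, repeat=n))
ONE = frozenset((s,) for s in STATES)


def seq(a, b):
    """a;b: glue a trace of a to a trace of b that starts where the first ends."""
    return frozenset(s + t[1:] for s in a for t in b if s[-1] == t[0])


def state_pred(states):
    """The state predicate satisfied exactly by the one-state traces over `states`."""
    return frozenset((s,) for s in states)


def all_state_preds():
    """Every state predicate of the model, one per subset of STATES."""
    for n in range(len(STATES) + 1):
        for subset in combinations(STATES, n):
            yield state_pred(subset)


def calculation_steps(p, q):
    """The six formulae of the calculation, in the order they appear."""
    return [
        seq(p, TRUE) & seq(q, TRUE),
        seq(p, seq(q, TRUE)),
        seq(seq(p, q), TRUE),
        seq(seq(p, TRUE) & q, TRUE),
        seq(seq(p, TRUE) & ONE & q, TRUE),
        seq(p & q, TRUE),
    ]


def check_calculation():
    """True when all six formulae coincide for every pair of state predicates."""
    return all(len(set(calculation_steps(p, q))) == 1
               for p in all_state_preds() for q in all_state_preds())
```

In this model `check_calculation()` compares all six formulae for each of the 64 pairs of state predicates. It is a sanity check of the steps, not a substitute for the calculation.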

Simon Hudon
March 23rd, 2011
ETH Zurich

Wednesday, March 9, 2011

Calculating with Partitions

In the book of Chandy & Misra [CM88], I found an interesting consequence of three variables of which exactly one is true. To begin with, such a relation can be formalized as follows:

(0)	[ x ∨ y ∨ z ]
(1)	[ ¬(x ∧ y) ]
(2)	[ ¬(x ∧ z) ]
(3)	[ ¬(y ∧ z) ]

Note

Before proceeding with the interesting fact, let's have a closer look at the square brackets. They come from the book of Dijkstra & Scholten and they constitute a form of implicit universal quantification. If x, y and z are normal boolean variables, it can be omitted. However if they are predicates ([DS90] introduces them as boolean structures), you can leave out their arguments and it is considered that they are universally quantified over. In such a case,

	[ x ∨ y ∨ z ]

would be equivalent to

	〈∀ i:: (x ∨ y ∨ z).i〉 

and predicate application is considered to distribute over the logical connectives so it is, in turn equivalent to

	〈∀ i:: x.i ∨ y.i ∨ z.i〉 

Since i doesn't play a role in our manipulations, it simplifies the formula and our work to leave it out.

This means that, if x, y and z are the characteristic predicates of sets X, Y, Z, our set of constraints states that they partition the type over which they range.

(end of note)
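
For readers who prefer executable notation, here is a minimal Python sketch of the square brackets over a finite domain. It is only an illustration: the domain, the three sets and the names `everywhere` and `is_partition` are assumptions of mine, not notation from [DS90].

```python
DOMAIN = range(10)  # hypothetical finite type over which the predicates range


def everywhere(pred):
    """[ pred ]: pred holds at every point of DOMAIN."""
    return all(pred(i) for i in DOMAIN)


# Three hypothetical sets and their characteristic predicates.
X = {0, 3, 6, 9}
Y = {1, 4, 7}
Z = {2, 5, 8}


def x(i):
    return i in X


def y(i):
    return i in Y


def z(i):
    return i in Z


def is_partition(x, y, z):
    """Constraints (0) to (3), with the connectives applied pointwise."""
    return (everywhere(lambda i: x(i) or y(i) or z(i))
            and everywhere(lambda i: not (x(i) and y(i)))
            and everywhere(lambda i: not (x(i) and z(i)))
            and everywhere(lambda i: not (y(i) and z(i))))
```

The lambdas spell out the distribution of predicate application over the connectives: `x(i) or y(i) or z(i)` plays the role of (x ∨ y ∨ z).i.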

Now the interesting consequence is this one:

(4)	[ ¬x  ≡  y ∨ z ]

Its proof is nice and simple. The usual heuristic to design a proof of such a theorem is either to manipulate the whole formula or to transform one side into the other. Since we don't have rules to relate either of y or z to ¬x, we choose to manipulate the whole formula. Also, we need to join ¬x and y ∨ z with either a disjunction or a conjunction because of the shape of the rules that we have. The golden rule, stated below, is a good candidate to achieve just that.

(5)	[ x ∨ y ≡ x ≡ y ≡ x ∧ y ]	("Golden rule")

Those unfamiliar with the rule can convince themselves of its validity by seeing that x ⇒ y can be equivalently formulated in either of the following ways:

	x  ≡  x ∧ y
	x ∨ y  ≡  y

By transitivity, they are equivalent. Hence, the golden rule. Now for the proof of (4):

	  ¬x  ≡  y ∨ z
	=   { ¬ over ≡ to get ¬ out of the way }
	  ¬(x  ≡  y ∨ z)
	=   { Golden rule }
	  ¬(x ∧ (y ∨ z)  ≡  x ∨ y ∨ z)
	=   { (0) }
	  ¬(x ∧ (y ∨ z))
	=   { ∧ over ∨.  It achieves the
	      grouping of (1) to (3) }
	  ¬( (x ∧ y) ∨ (x ∧ z) )
	=   { ¬ over ∨ }
	  ¬(x ∧ y) ∧ ¬(x ∧ z)
	=   { (1) and (2) }
	  true

This simple proof is six steps long and it carries an inactive negation around needlessly. We could remove one step and the noise of the negation if we simply realize that

	¬(x  ≡  y)  ≡  x  ≢  y

Also, we can use the golden rule over ≢ if we realize that replacing an even number of ≡ by ≢ in a sequence of equivalences leaves the truth value unchanged.
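
The remark about an even number of replacements is easy to check by brute force. The Python sketch below is mine, not part of the original argument; it evaluates the golden rule with any choice of ≡ (`eq`) or ≢ (`ne`) at each of its three positions. The helper names `chain` and `golden_rule_holds` are hypothetical.

```python
from itertools import product
from operator import eq, ne


def chain(operators, operands):
    """Evaluate operands[0] op0 operands[1] op1 ... from left to right.

    Python's own `a == b == c` means `a == b and b == c`, which is not the
    associative reading used in the text, hence the explicit fold.
    """
    result = operands[0]
    for op, operand in zip(operators, operands[1:]):
        result = op(result, operand)
    return result


def golden_rule_holds(operators):
    """Check [ x ∨ y  op0  x  op1  y  op2  x ∧ y ] by enumerating x and y."""
    return all(chain(operators, [x or y, x, y, x and y])
               for x, y in product([False, True], repeat=2))
```

With `[eq, eq, eq]` this is the golden rule itself. Replacing two of the operators by `ne` keeps the formula valid, while replacing one or three of them makes it false everywhere.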

	  x  ≢  y ∨ z
	=   { Golden rule }
	  x ∧ (y ∨ z)  ≢  x ∨ y ∨ z
	=   { (0) }
	  ¬(x ∧ (y ∨ z))
	=   { ∧ over ∨ }
	  ¬((x ∧ y) ∨ (x ∧ z))
	=   { ¬ over ∨ }
	  ¬(x ∧ y) ∧ ¬(x ∧ z)
	=   { (1) and (2) }
	  true

It is nicer this way: the proof is shorter and ¬ is inactive only for one step. We can suppose that it can be generalized to more variables. Indeed, (0) to (3) can be generalized by (6) and (7) and we can replace x, y, z by x.i for any appropriate i.

(6) [〈∃ i:: x.i〉 ]
(7) [〈∀ i,j: i ≠ j: ¬(x.i ∧ x.j)〉 ]

A first attempt at generalizing (4) might yield

(4')	[ x.j  ≢ 〈∃ i: i ≠ j: x.i〉 ]

but we can do better. We don't need to have only one variable on the left hand side of ≢; we could have many, like on the right hand side. The important thing is that variables appear on exactly one side. To formalize this, we can use an arbitrary separating predicate P. The x.i such that P.i holds will be on the left and the others will be on the right.

(4'')	[〈∃ i: P.i: x.i〉 ≢〈∃ i: ¬P.i: x.i〉 ]

Notice how nicely symmetric (4'') is. The same technique can be used to prove it as what we used for (4).

	 〈∃ i: P.i: x.i〉 ≢〈∃ i: ¬P.i: x.i〉 
	=   { Golden rule }
	   〈∃ i: P.i: x.i〉 ∧〈∃ i: ¬P.i: x.i〉 
	  ≢〈∃ i: P.i: x.i〉 ∨〈∃ i: ¬P.i: x.i〉 
	=   { ∧ over ∃ and nesting; merge ranges }
	 〈∃ i,j: P.i ∧ ¬P.j: x.i ∧ x.j〉 ≢〈∃ i:: x.i〉 
	=   { (6) }
	  ¬〈∃ i,j: P.i ∧ ¬P.j: x.i ∧ x.j〉 
	=   { ¬ over ∃ }
	 〈∀ i,j: P.i ∧ ¬P.j: ¬(x.i ∧ x.j)〉 
	⇐  { P.i ∧ ¬P.j ⇒ P.i ≠ P.j and the 
	      contrapositive of Leibniz's
	      principle applies }
	 〈∀ i,j: i ≠ j: ¬(x.i ∧ x.j)〉 
	=   { (7) }
	  true

Our next investigation is whether or not (4'') is sufficient to characterize the kind of partition defined by (6) and (7). To do so, we prove that (4'') ≡ (6) ∧ (7). Mutual implication is an interesting choice of strategy since half of the work is already done and we only have a weaker proof obligation to take care of. We can prove (6) and (7) separately under the assumption of (4'').

Proof of (6)

	 〈∃ i:: x.i〉 
	≠   { (4'') with P.i ≡ true }
	 〈∃ i: false: x.i〉 
	=   { Empty range }
	  false

For the proof of (7), we have to notice the formal differences between (7) and one of the sides of (4''). First of all, (7) has more dummies and has a universal quantification instead of an existential one. Second, the term of the quantification has two (negated) occurrences of x rather than just one positive one. What we must do is split the dummies between two quantifications --nesting should do nicely--, then move the second x term out of the inner quantification and use one of the negations to change the ∀ into ∃. The formula should then be amenable to an application of (4'').

Proof of (7)

	 〈∀ i,j: i ≠ j: ¬(x.i ∧ x.j)〉 
	=   { Nesting to separate the dummies }
	 〈∀ i::〈∀ j: i ≠ j: ¬(x.i ∧ x.j)〉 〉 
	=   { ¬ over ∧ then ∨ over ∀ to keep only one
	      x term inside the inner quantification }
	 〈∀ i::〈∀ j: i ≠ j: ¬x.j〉  ∨ ¬x.i〉  
	=   { ¬ over ∃ }
	 〈∀ i:: ¬〈∃ j: i ≠ j: x.j〉  ∨ ¬x.i〉  
	=   { Now we have the shape we were looking for,
	      we can use (4'') with P.j  ≡  i = j }
	 〈∀ i::〈∃ j: i = j: x.j〉  ∨ ¬x.i〉  
	=   { One point rule }
	 〈∀ i:: x.i ∨ ¬x.i〉  
	=   { Excluded middle and identity of ∀ }
	  true

Luckily enough, after the application of (4''), the steps are of the kind "there is only one thing to do". Without context, the rest of the proof could look like a giant rabbit but, since (4'') is the main part of the rules that we can apply, it is reasonable to expect that we should create an opportunity to apply it and each step before its application bridges the formal difference between the two.

The above proof has been presented as a nice example that an elegant proof can be designed using only the properties of the formulae rather than their interpretation and to present some more advanced heuristics to do so.

The investigation has been prompted by my delight at seeing how nicely the first parts of the problem were solved using the golden rule but the decision to show the proof was taken because of my surprise at how easily the last proof came to me with a simple analysis of the formal differences between the two main formulae although I had expected it to keep me busy for a long time.
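
As a complement to the proofs, the equivalence (4'') ≡ (6) ∧ (7) can be checked exhaustively for a small number of indices. The Python sketch below is an illustration under assumptions of mine: the x.i are plain booleans (so the square brackets can be omitted), there are four indices, and the function names are hypothetical.

```python
from itertools import product

N = 4  # hypothetical number of indices; small enough to enumerate everything
BOOLS = (False, True)


def holds_6(x):
    """(6): some x.i holds."""
    return any(x)


def holds_7(x):
    """(7): no two distinct x.i and x.j hold together."""
    return all(not (x[i] and x[j])
               for i in range(N) for j in range(N) if i != j)


def holds_4pp(x):
    """(4''), required for every separating predicate P over the indices."""
    return all(
        any(x[i] for i in range(N) if P[i])
        != any(x[i] for i in range(N) if not P[i])
        for P in product(BOOLS, repeat=N))


def equivalence_holds():
    """Compare (4'') with (6) ∧ (7) for all 2**N valuations of x."""
    return all(holds_4pp(x) == (holds_6(x) and holds_7(x))
               for x in product(BOOLS, repeat=N))
```

An exhaustive check for one value of N says nothing about the general case, which is what the calculations establish, but it is a cheap way of catching a misstated formula.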

Simon Hudon
March 8th, 2011
Meilen

References

[CM88] K. Mani Chandy and Jayadev Misra, Parallel Program Design:
       A Foundation, Addison-Wesley, 1988

[DS90] Edsger W. Dijkstra and Carel S. Scholten, Predicate Calculus
       and Program Semantics, Texts and Monographs in Computer
       Science, Springer-Verlag New York Inc., 1990

Tuesday, November 9, 2010

On the Notion of Ghost Variables

When one reads the literature of formal methods, especially that of formal verification, one encounters the notion of ghost variable, also called model variable. In short, they are variables added to a program for the sake of verification to express simply and concisely what data abstraction a certain module is supposed to implement. They reflect very well the a posteriori nature of verification because they are added to a presumably existing program. They are characteristically absent from the regular data flow and control flow and, as such, their values need not be stored during the execution of the program. They are mere abstractions.

The naming seems to come from the fact that the predominant role the verification community gives to a program variable is to be an abstraction for the storage of a category of important values. A variable for which the values need not be stored then feels kind of odd: it looks like a variable but it's not really one. Hence the special attribute is added to warn anyone reading the program that this is a strange sort of variable.
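
To make the usual notion concrete, here is a small Python sketch of my own choosing: a queue stored as two stacks, with a ghost sequence that states what abstract queue the module implements. The class and attribute names are hypothetical. The ghost variable appears only in assertions and in its own updates, never in the values returned or in the control flow.

```python
class TwoStackQueue:
    """FIFO queue stored as two stacks, specified through a ghost sequence."""

    def __init__(self):
        self._front = []   # stored: the next element to dequeue is at the end
        self._back = []    # stored: the latest enqueued element is at the end
        self._ghost = []   # ghost: the abstract queue contents, oldest first

    def _abstraction_holds(self):
        # Relates the stored variables to the ghost variable.
        return self._ghost == self._front[::-1] + self._back

    def enqueue(self, item):
        self._back.append(item)
        self._ghost.append(item)          # ghost update only
        assert self._abstraction_holds()

    def dequeue(self):
        if not self._front:
            # Reverse the back stack onto the front stack.
            while self._back:
                self._front.append(self._back.pop())
        item = self._front.pop()
        expected = self._ghost.pop(0)     # ghost update only
        assert item == expected
        assert self._abstraction_holds()
        return item
```

Deleting `_ghost` and every line that mentions it leaves the observable behaviour unchanged, which is exactly why its value need not be stored in a final implementation.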

We can take a different view with respect to variables though. If we try to understand a problem to then create a program to solve it and a compelling argument for trusting the results of the latter, we need a way to speak of the problem. Namely, it is important to identify the central abstractions, the objects of central importance and give them a name so that their properties of interest can be spelled out using that name rather than mainly in terms of their relation to other objects. One such example is given by:

Jim and Julia's brother play football together. During one match, Jim tackled Julia's brother and hurt him badly.

Compare it to:

Jack and Julia are siblings. Jim and Jack play football together. During one match, Jim tackled Jack and hurt him badly.

Notice how disentangled the second version of the story is. Every sentence states a fact and, if we are interested in the accident, we realize that Julia has very little role to play in it and the first sentence can be ignored altogether without missing any information. In the first version of the story, Julia and her relationship to one of the players is dragged all through the story although, as far as we can see, she plays no role in it herself. More about the issue of naming in mathematical arguments (which is relevant in software design) can be found in [0], in chapter 15 which is dedicated to the subject.

The point I am getting at with all this rambling on naming is that, during the analysis of a programming problem and the design of a solution, the introduction of variables should be seen as the introduction of names for the relevant quantities. The binary representation of the exact set of variables introduced should not be of immediate concern. A confidence that an efficient representation can be found later on is healthy but any more thought about it should be postponed. What Dijkstra calls a change of coordinates in [1] can be applied later to improve efficiency. In the mean time, the prime concern is to establish sufficient conditions for an implementation to be correct. It is also possible that the set of variables chosen during the design process doesn't have to be transformed in order to get an efficient implementation and, again, it is of no concern while we come up with them.

When I say those problems are of no concern, it means we should not concentrate on them but also that we should not let the vocabulary that we use reflect a distinction that we are not ready to make. For that purpose, I would abandon the notion of model or ghost variables or, at least, downplay their role considerably and introduce the notion of stored variables. Those are the variables of the final implementation for which the value has an impact over the control flow and the data flow directed to the output variables. By symmetry, we can call the others model or ghost variables but, since we reduce their importance to the final implementation stage and since, even there, they are the ones to be left out rather than kept, omitting to name them would not be a big methodological impediment.

Regards,
Simon Hudon
ETH Zürich
Tuesday the 9th of November 2010

References

  [0] On the Shape of Mathematical Arguments, A.J.M. van Gasteren
  [1] EWD1032 - Decomposing an Integer in a Sum of two Squares, E.W. Dijkstra

Monday, October 18, 2010

The Segment of Maximum Sum

This is the recording of a solution found together with Bernhard Brodowsky where he did most of the work in one of the first lectures where I explained the techniques to him. The problem is to find a segment of an array for which the sum of the elements is maximal. We stopped when we had a simple specification of the body of the loop for fear that actually calculating the assignments would be too much tedium. In this recording, I will calculate them to see exactly how tedious it is. Also, we did not pay any mind to calculating too many auxiliary values --that is using them before updating the variable holding it in the loop body so that one is always thrown away--. In this recording, the solution will be changed slightly to avoid that.

First, here is the postcondition that we want to implement:
 P0: r = 〈↑ p, q: 0 ≤ p ≤ q ≤ N: 〈∑ i: p ≤ i < q: X.i〉〉
where ↑ denotes the maximum of its two operands. Like conjunction and disjunction, it is idempotent, associative and symmetric so we can use it as a quantifier in the same way as universal and existential quantification. The first thing to do is to introduce a name for the summation part so that we can manipulate it independently of the overall specification.
 (0) S.p.q = 〈∑ i: p ≤ i < q: X.i〉
And the specification becomes:
 P1: r = 〈↑ p, q: 0 ≤ p ≤ q ≤ N: S.p.q〉
The traditional way to get an invariant from this is to replace N by a variable, let's call it n, and use n = N as an exit condition:
 J0: r = 〈↑ p, q: 0 ≤ p ≤ q ≤ n: S.p.q〉 
 C0: n = N
By symmetry, we can assume that 0 is a proper initial value for n which makes S.0.0 a good one for r.
 I0: n = 0
 I1: r = 0
Last formality: having n increase by one at each iteration will guarantee termination provided 0 ≤ N.
 S0: n' = n + 1
Let's now start calculating with J0 to see what adjustments we have to make to r to preserve it. Two distinct heuristics get us to start calculating with the right-hand side of the equality: it is much more complicated than the left-hand side, so we can assume that most of our moves will be eliminations, and all the program variables involved in the right-hand side don't have a matching assignment, so we would basically be stuck right at the start. For a given loop with invariant J and body S, the invariance of J is assured by:
 J'  ⇐  J ∧ S
where J' is J with all its program variables primed. It means we can use whatever is part of the body or the invariant to prove J's invariance; to do so, we can invent new statements for the body as we go.
    〈↑ p, q: 0 ≤ p ≤ q ≤ n': S.p.q〉 
  =   { S0 }
    〈↑ p, q: 0 ≤ p ≤ q ≤ n+1: S.p.q〉 
  =   { Split range }
      〈↑ p, q: 0 ≤ p ≤ q ≤ n: S.p.q〉
    ↑ 〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉 
  =   { Use final value of a new variable using
         Q0: s' = 〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉 and J0 }
    r ↑ s'
  =   { New statement: S1: r' = r ↑ s' to conclude
         the proof of J0's invariance }
    r'

 Q0: s' = 〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉
 S1: r' = r ↑ s'
Q0 links unprimed variables with primed variables in a way that cannot be evaluated. Let's turn it into an invariant to fix this.
    〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉
  =   { S0 }
    〈↑ p: 0 ≤ p ≤ n': S.p.n'〉
  =   { J1' }
    s'
where
 J1: s = 〈↑ p: 0 ≤ p ≤ n: S.p.n〉
We now have to make sure that J1 is also invariant and use the same heuristic as for J0 to do so.
    〈↑ p: 0 ≤ p ≤ n': S.p.n'〉
  =   { S0 and split }
    〈↑ p: 0 ≤ p ≤ n: S.p.(n+1)〉↑ S.(n+1).(n+1)
  =   { (1), see below ; S.(n+1).(n+1) = 0 }
    〈↑ p: 0 ≤ p ≤ n: S.p.n + X.n〉↑ 0
  =   { + over ↑ }
    (〈↑ p: 0 ≤ p ≤ n: S.p.n〉+  X.n) ↑ 0
  =   { J1 }
    (s + X.n) ↑ 0
  =   { New statement: S2: s' = (s + X.n) ↑ 0 }
    s'

 S2: s' = (s + X.n) ↑ 0

 (1) S.p.(n+1) = S.p.n + X.n   ⇐   p ≤ n
(1) is a theorem that we calculated because we had a term S.p.(n+1) that we needed to replace by one formulated in terms of S.p.n to make it possible to use J1 to eliminate the quantified maximum. Proof of (1):
    S.p.(n+1)
  =   { (0) }
    〈∑ i: p ≤ i < n+1: X.i〉
  =   { Split range using p ≤ n }
    〈∑ i: p ≤ i < n: X.i〉+  X.n
  =   { (0) }
    S.p.n + X.n
end of proof
All that is missing now to have a complete program is to add an initialization for s. An examination of J1 tells us that for n = 0, S.0.0 is an appropriate value for s. We now get:
   n = 0 ∧ r = 0 ∧ s = 0
 ; while n ≠ N do
    n' = n+1  ∧  r' = r↑s'  ∧  s' = (s + X.n) ↑ 0
   od
This would be a simple enough program to execute except for S1 which has primed variables on both sides of the equality. Let's see if we can find a sequence of assignments (rather than a conjunction of equalities) that will do the same as S0 ∧ S1 ∧ S2. The main law that we use for calculating assignments is
 (2) s := E ; P  ≡  P [s \ E] 
where P [s \ E] is the same as P except for the occurrences of s which are replaced by E. The way in which we will make assignments appear is by replacing terms of the initial specification by specifications of the form x' = x so that we can eventually replace the whole boolean specification by a skip and keep only the assignments.
    n' = n+1  ∧  r' = r↑s'  ∧  s' = (s + X.n) ↑ 0
  =   { Let's deal with s' to introduce s' = s 
         and solve the assignment to r }
      s := (s + X.n) ↑ 0  
    ; (n' = n+1 ∧ r' = r↑s' ∧ s' = s)
  =   { Leibniz }
      s := (s + X.n) ↑ 0
    ; (n' = n+1 ∧ r' = r↑s ∧ s' = s)
  =   { Assignment (either of r or n will do.  Let's pick
         them in inverse order of introduction) }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; (n' = n+1 ∧ r' = r ∧ s' = s)
  =   { One last time, assignment }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; n := n+1  
    ; (n' = n ∧ r' = r ∧ s' = s)
  ⇐   { Introduce skip }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; n := n+1  
    ; skip
  =   { Identity of ; }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; n := n+1
It is a bit disappointing to see that the order between the assignment to r and that to n is irrelevant and we could have used a multiple simultaneous assignment for them but, then, we would have noticed that the choice of grouping n with r or grouping it with s is also irrelevant. One way or another, an irrelevant choice has to be made. By interpreting the initialization in a similar way, we get the following program:
   n, r, s := 0, 0, 0
   { invariant J0 ∧ J1 }
 ; while n ≠ N do
      s := (s + X.n) ↑ 0
    ; r := r ↑ s
    ; n := n + 1
   od
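
For the curious, here is a direct transcription of the final program into Python, together with the specification P1 evaluated by brute force. This is a sketch of mine rather than part of the derivation: the function names are hypothetical, X is a Python list, and the optional assertions evaluate J0 and J1 at the start of every iteration.

```python
import random


def S(X, p, q):
    """(0): the sum of X.i for p <= i < q."""
    return sum(X[p:q])


def max_segment_sum_spec(X):
    """P1, evaluated directly: the maximum of S.p.q over 0 <= p <= q <= N."""
    N = len(X)
    return max(S(X, p, q) for p in range(N + 1) for q in range(p, N + 1))


def max_segment_sum(X, check_invariants=False):
    """The derived program; optionally asserts J0 and J1 on every iteration."""
    N = len(X)
    n, r, s = 0, 0, 0
    while n != N:
        if check_invariants:
            assert r == max_segment_sum_spec(X[:n])            # J0
            assert s == max(S(X, p, n) for p in range(n + 1))  # J1
        s = max(s + X[n], 0)
        r = max(r, s)
        n = n + 1
    return r


def random_check(trials=200, seed=0):
    """Compare the program with its specification on random small arrays."""
    rng = random.Random(seed)
    for _ in range(trials):
        X = [rng.randint(-5, 5) for _ in range(rng.randint(0, 8))]
        if max_segment_sum(X, check_invariants=True) != max_segment_sum_spec(X):
            return False
    return True
```

The order of the three assignments matters only for s: it must be updated before r uses it, and before n is incremented because its update reads X.n.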

Monday, August 30, 2010

Modularity in Design
versus
Modularity in Code

At the moment, the widespread use of libraries is characterized by the reuse of code in various contexts outside the one which gave it birth. It has been made much easier by the development of .NET --which I salute as being a great technical achievement-- because .NET allows the combination of code written in different languages. I believe it is a stepping stone with respect to exposing programming languages as reasoning aids rather than execution models and it is only hampered by Microsoft's lack of commitment towards portability.

However, I suggest that, although it is an important step, it is not satisfactory as far as the concern for reuse goes. I propose that we start thinking of the reuse of the thoughts behind the code and that they are the things of foremost importance. It does not mean that reusing code is not important -- assuming we preserve the documentation of its interfaces-- but it does not go far enough. Indeed, in code, all the different ideas justifying the design choices are mangled together and one can hardly look at each aspect in isolation. In the community of programming languages, several attempts have been made to provide better modules but, so far, we still see a lot of code where different concerns end up intertwined with one another and I expect it will remain the case.

I think that a good modularity cannot be achieved in (what is understood today as) a programming language because of the universal assumption that the code a programmer types has to be directly (through compilation) executable.

My favorite example comes from parallel programming. In the field of programming language design, many mechanisms have been proposed to make the task of parallel programming easier. They include semaphores, monitors and message passing. They make it easier to avoid low level design errors like race conditions but they still don't address design issues. In each case, the code is still organized in processes; since the behavior of a parallel program is really a property of the combination of the processes, a change to said behavior requires a change of many modules when the modular boundaries are drawn around processes.

Aside: I consider threads to be a special kind of process and I don't think that, at the design level, differentiating between the two is of any help at all. I will therefore not mention threads again. (end of aside)

As a comparison, UNITY ([CM88]) uses action systems as the basic unit of composition. An action system has some local variables and can share variables with its environment. It also has a set of atomic actions that can modify the variables. The important thing is that it is possible to make assumptions about how the data is going to be manipulated, to enforce the assumptions and to use them to prove some safety and progress properties. When someone wants to change the functionality of a program, either he can add a new module built around the new properties or he can add the properties to an existing module, if it is close enough to what the module was built for. The key here is to see that program properties (specifically safety and progress properties) are closely mapped to modules so we have a very good encapsulation.

At the end of a development, a UNITY system consists of a set of actions and a set of variables both partitioned between modules. If someone wants to execute the system, he needs first to choose the platform on which he wants to execute it (e.g. it could be a network of computers or a network of processes residing on one computer) and then partition the various variables and actions of the system between the components of the platform without consideration for the modular boundaries. Some variables might become channels between processes or between computers, others may be stored in the memory of a computer, private to one process or shared and possibly protected by locks.
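
To give a feel for what an action system looks like, here is a toy rendition in Python. It is not UNITY notation and the example (sorting by swapping adjacent elements) is my own choice for illustration: the state is a list, each action is a guarded atomic update, and a random scheduler stands in for UNITY's fairness assumption. All names are hypothetical.

```python
import random


def make_sort_actions(n):
    """One atomic action per adjacent pair: swap the pair when out of order."""
    def make_action(i):
        def action(a):
            if a[i] > a[i + 1]:                  # guard
                a[i], a[i + 1] = a[i + 1], a[i]  # atomic update
        return action
    return [make_action(i) for i in range(n - 1)]


def at_fixed_point(a):
    """No guard is enabled, so no action can change the state any more."""
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))


def run(a, actions, rng):
    """Execute actions picked at random until the fixed point is reached.

    Random choice stands in for the fairness assumption; the assertion
    checks a safety property after every action.
    """
    initial = sorted(a)
    steps = 0
    while not at_fixed_point(a):
        rng.choice(actions)(a)
        assert sorted(a) == initial  # safety: the contents are never altered
        steps += 1
    return steps
```

Nothing in the sketch says which process or machine executes which action. That mapping is the later, separate decision described above.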

In sequential programming, the control flow is something that can be closely tied to program properties and it can therefore be safely attributed to a modular unit --such as a procedure-- because the properties of interest --preconditions, postconditions and termination-- closely apply to syntactic structures describing the control flow. That is to say that pre- and postconditions are paired and, between them, lies a sequence of statements which are structured in a proper fashion for proving correctness. Similarly, loops are associated with invariants and variants and nothing outside of the loop is relevant for reasoning about said loop.

In contrast, [FvG99] shows the kind of proof obligation one has to contend with when working with processes in parallel programming. In their theory, each assertion has local correctness, which is analogous to partial correctness in sequential programming, and a global correctness which relies on the structure of the other processes. At this point, it is unthinkable to take one process away and to expect that the properties one has proven about it will hold in a different environment. In short, the interface between processes, when made explicit, is too far from a thin interface to allow processes to properly encapsulate anything.

In UNITY, the fact that it is based on action systems rather than processes makes it possible to attribute a precise and small set of properties to a module and work in isolation on it. The usual process is called refinement and it makes all the intermediate design decisions explicit. It makes the effort spent on the development easier to transpose to different situations even when none of the code produced is actually useful.

The UNITY method has proof rules and formal semantics and the most rigorous approaches will take advantage of them. However, it is not necessary to use the whole battery to reap the benefits of the method. Since modularity is something they did well with UNITY, using it and its temporal logic as a guideline for design can already be beneficial in pointing out what we must be careful about.

In short, I believe that it is a property of paramount importance for modular boundaries to confine each programming task to the construction or modification of only one module. The best possible way to achieve that is to choose a unit of modularity for which the proofs of correctness need only rely on very few, simple and chosen a priori --that is, chosen as much as possible independently of any implementation-- assumptions of the surrounding environment. Some might say that it is an obvious goal but as far as I can see, very few modular decomposition mechanisms are satisfactory in that respect and I think that most language designers are still confined by "how the program is going to be executed" so are unable to meet my criteria.

*                         *
*

I know of another effort directed at the reuse of design: the so-called design patterns. The idea is a good one but its realization is much too weak. They are basically patterns of modular decomposition which have proved useful in many situations. However, there is no basis to state precisely what properties make them useful or how they meet those properties. In short, they cannot have an existence of their own because no notion of correctness applies to them. I do recognize it as a step in the right direction but it seems over-hyped for what it provides. I would welcome the publication of such books where designs are proven correct, where it is (formally) precise where the modules fit in a larger design and how one should proceed to code them correctly.

Next, I would have liked to address some of the objections to formal methods that I have heard again and again and that used to unsettle me. However, this post is already pretty long and, by now, it is long overdue. I will, therefore, leave it for a post which, hopefully, will be coming soon.

Simon Hudon
April 10th, 2011
Meilen

References

[CM88] K.M. Chandy and J. Misra, Parallel Program Design:
       A Foundation, Addison-Wesley, 1988

[FvG99] W.H.J. Feijen, A.J.M. van Gasteren, On a Method of
       Multiprogramming, Springer-Verlag New York, Inc.,
       1999

Thursday, August 5, 2010

The Development of a Solution to the Problem of the Cubes

Note:
Before I begin, let me point out that the publication of this blog entry is made possible because I haven't used Blogger's interface to edit it (or at least, I used it as little as possible). Instead, I wrote it in a plain text editor using a markup language I am developing. Since I haven't developed an HTML generator yet, I had to convert it to HTML myself. Blogger did not make it easy but I think the result is much better than what I got previously by just fighting with the web interface. I would certainly appreciate any feedback concerning the resulting layout.
(end of note)

The first statement of this problem that I have seen is in Wim Feijen's note WF114 [0]. I really like the structure of his presentation but, whenever I have reproduced it, either for myself or to present the problem and a method to find a solution to somebody else, I have made one slight change: instead of using predicate transformer semantics, I use a relational semantics and I believe the switch to be quite significant from a practical standpoint. I still use the invariant based loop development techniques developed by Dijkstra & cie instead of the recursion / precondition technique of Hehner and I think it is very characteristic of my style. It shows that I prefer to manipulate one small part of the program at a time.

DEVELOPMENT
===========

The problem at hand consists in producing an array with the N first cubes without resorting to multiplication operations. Formally, the postcondition we are trying to satisfy is:
 P0:   (A i: 0 ≤ i < N: f.i = i^3)
with f the output variable (a function).
About notation: '.' is the function application operator and (A i: R: T) is a universal quantification over the range R and with the term T. Here, we use the property that ≤ and < are mutually conjunctional which means that their combined application can be unfolded into
 a ≤ b < c  ≡  a ≤ b ∧ b < c
They are also mutually conjunctional with equality (=) whereas logical equivalence (≡) is not conjunctional at all because it is associative and the two features of the operator would interfere.
(end of note)
We can now proceed to find a suitable invariant for our loop. A good technique to do so is to replace some constants in the postcondition by variables. Since it is a quantification, it seems reasonable to focus our search in the range. We find 0 and N. We could choose either or both but if we use 0 as our initial value and N as the final one, we get the first cube for free. We will therefore opt for replacing only N with a variable:
 E0:   k = N
 J0:   (A i: 0 ≤ i < k: f.i = i^3)
If we start k off at 0, the range of J0 becomes false and the invariant is therefore trivially true. From there, it is not a big step to say that increasing k regularly will lead us to E0 (our variant is then N-k).
 I0:   k = 0
 S0:   k' = k + 1
It is now time to see how we will maintain J0 (in the loop) while applying S0 to converge.
Note on proof obligations
If we use the technique of the invariant to build our loop, we have to fulfill certain proof obligations. Assuming that J is the invariant, B is the body of the loop, E is the exit condition, IN is the initialization of the loop and P is the postcondition:
 (0)   J'  ⇐  J ∧ B ∧ ¬ E    "B (and E) preserve(s) J"
 (1)   J  ⇐  IN              "IN establishes J"
 (2)   P  ⇐  J ∧ E           "J and E lead to P"
In the preceding definitions, primed predicates are predicates for which all unprimed program variables are replaced with their primed version. Unprimed variables designate their initial values (before the execution of a given statement or program) and the primed variables designate their final values.
 (3)     (B and E preserve J) 
       ∧ (IN establishes J) 
       ∧ (J and E lead to P)
    ≡ 
       { IN } until E do B od { P }
The last line is to be interpreted as a Hoare triple with assertions between brackets. It reads "started in a state satisfying IN, the program [here the loop] finishes in a state satisfying P." (end of note)
   (A i: 0 ≤ i < k': f'.i = i^3)
 =   { S0 }
   (A i: 0 ≤ i < k + 1: f'.i = i^3)
 =   { Split off one term }
   (A i: 0 ≤ i < k: f'.i = i^3) ∧ f'.k = k^3
 =   { Modify only f.k; S1: (A i: 0 ≤ i < k: f'.i = f.i) }
   J0 ∧ f'.k = k^3
 S1:   (A i: 0 ≤ i < k: f'.i = f.i)
The last line of the previous proof makes a nice assignment. The only problem is that it requires some multiplications, so we will continue our calculations rather than adopt it as our next assignment.
   f'.k = k^3
 =   { Let's introduce a new variable to hold k^3 }
     { at all times.  J1: a = k^3                 }
   f'.k = a
 S2:   f'.k = a
 J1:   a = k^3
We could also add the symmetric complement of S1, namely that every value of f beyond k stays unchanged, but it is not required yet. We add it afterwards to complement S1 and S2 as array assignments, but it is not urgent.
 (4)   (J0'  ≡  J0 ∧ S2)  ⇐  J1 ∧ S0 ∧ S1
Since the left argument of ⇐ is monotonic, we can weaken (4) by weakening its left-hand side and replacing the equivalence by a consequence (⇐). After applying the shunting rule (see (A) in the appendix), we have exactly the shape of the proof obligation for maintaining an invariant... except that the invariant used to maintain J0 (i.e. J0 ∧ J1) is much stronger than J0. We can prove that we can maintain it too, though.
 (5)   J0'  ⇐  J0 ∧ J1 ∧ S0 ∧ S1 ∧ S2
To maintain J1, we will split it in two, manipulate the left-hand side of the equality since we know a lot about k and hope it will lead to operations on a.
   k'^3
 =   { S0 }
   (k + 1)^3
 =   { Unfold _ ^3; ∙ over + }
   k^3 + 3∙k^2 + 3∙k + 1
 =   { J1 }
   a + 3∙k^2 + 3∙k + 1
 =   { We need a new variable to hold the term           }
     { which contains a product. J2: b = 3∙k^2 + 3∙k + 1 }
   a + b
 =   { We want to fall back on J1'.  S3: a' = a + b is   }
     { a good candidate.                                 }
   a'
 J2:   b = 3∙k^2 + 3∙k + 1
 S3:   a' = a + b
 (6)   J1'  ⇐  J1 ∧ J2 ∧ S0 ∧ S3
It seems like we have just created some more work for ourselves but we can be reassured by the sight of the decreasing degree of the polynomials in our new invariants.
   3∙k'^2 + 3∙k' + 1
 =   { S0 }
   3∙(k + 1)^2 + 3∙(k + 1) + 1
 =   { ∙ over + }
   3∙k^2 + 6∙k + 3 + 3∙k + 3 + 1
 =   { J2 }
   b + 6∙k + 6
 =   { You know the drill! J3: c = 6∙k + 6 }
   b + c
 =   { S4: b' = b + c }
   b'
 J3:   c = 6∙k + 6
 S4:   b' = b + c
 (7)   J2' ⇐  J2 ∧ J3 ∧ S0 ∧ S4
Now for the last touch on the invariant:
   6∙k' + 6
 =   { S0 }
   6∙k + 6 + 6
 =   { J3 }
   c + 6
 =   { S5: c' = c + 6 }
   c'
 S5:   c' = c + 6
 (8)   J3'  ⇐  J3 ∧ S0 ∧ S5
We can now happily conclude
 (9)   S6 preserves J4
with
 S6:   S0 ∧ S1 ∧ S2 ∧ S3 ∧ S4 ∧ S5
 J4:   J0 ∧ J1 ∧ J2 ∧ J3
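In the spirit of complementing proofs with testing, here is a minimal Python sketch that checks claim (9) on a finite range of states. It is only a sanity check, not a proof. The function names `j4`, `s6` and `check_preservation` are my own labels for illustration, and the check itself is allowed to use multiplication since it is not part of the derived program.

```python
def j4(k, a, b, c, f):
    """J4 = J0 ∧ J1 ∧ J2 ∧ J3, evaluated on one concrete state (f is a list)."""
    j0 = all(f[i] == i**3 for i in range(k))
    j1 = a == k**3
    j2 = b == 3 * k**2 + 3 * k + 1
    j3 = c == 6 * k + 6
    return j0 and j1 and j2 and j3


def s6(k, a, b, c, f):
    """One step of S6. Every right-hand side reads the unprimed (old) values."""
    f_new = list(f)      # S1: entries below k are left unchanged
    f_new[k] = a         # S2: f'.k = a
    # S0, S3, S4, S5 taken together as one simultaneous assignment
    return k + 1, a + b, b + c, c + 6, f_new


def check_preservation(n):
    """For every k below n, build a state satisfying J4 and check J4 after S6."""
    for k in range(n):
        f = [i**3 for i in range(k)] + [None] * (n - k)
        state = (k, k**3, 3 * k**2 + 3 * k + 1, 6 * k + 6, f)
        assert j4(*state)
        assert j4(*s6(*state))
    return True
```

Calling `check_preservation(50)` exercises the first fifty values of k. A failure would point at a mistake in one of the calculations above; a success proves nothing but is cheap reassurance.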
We started off knowing that J0 ∧ E0 ⇒ P0 so all that is left for a correct loop is to find the initialization. We already have I0 and, from there, we can easily calculate the initial values of the auxiliary variables. Let's ignore the calculations and see the initial values right away.
 I1:   a = 0
 I2:   b = 1
 I3:   c = 6
We can see that there is no value specified for f and, as a matter of fact, any value will do. We now have the final lemma for the correctness of this loop.
 (10)  I0 ∧ I1 ∧ I2 ∧ I3 establishes J0 ∧ J1 ∧ J2 ∧ J3
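Putting the pieces together, the derived loop can be sketched in Python as follows. This is one possible transcription, assuming N ≥ 0 and using a list for f; the function name `table_of_cubes` is mine. Python's tuple assignment evaluates all right-hand sides before storing anything, so it plays the role of the simultaneous assignment S0 ∧ S3 ∧ S4 ∧ S5.

```python
def table_of_cubes(n):
    """Return [0**3, 1**3, ..., (n-1)**3] using additions only (assumes n >= 0)."""
    f = [0] * n                  # any initial contents will do
    k, a, b, c = 0, 0, 1, 6      # I0, I1, I2, I3
    while k != n:                # loop until E0: k = N
        f[k] = a                 # S2, using the old values of k and a
        # S0, S3, S4, S5 as one simultaneous assignment
        k, a, b, c = k + 1, a + b, b + c, c + 6
    return f
```

For example, `table_of_cubes(6)` yields `[0, 1, 8, 27, 64, 125]`. No multiplication appears in the body of the loop.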
CONCLUSION ==========
Looking at the end result, we can see that we ended up with a series of assignments (SX) but we haven't specified an order in which they have to be executed. That is because coming up with an order is both simple and tedious, and the order is so arbitrary that it is all noise. For that reason, we can trust a compiler to do it for us. When we have to deal with a programming language that doesn't support multiple simultaneous assignments, we have to take the time to introduce the noise that is the order ourselves. The key here is that we must not overwrite a variable before we are done with its initial value. We have a bit of a problem when we run into an assignment where the uses and the writes of variables are cyclically dependent, as in the following:
 S7:   x' = y ∧ y' = x
To solve the problem, we can add an auxiliary variable (say, t) and initialize it so that it can be substituted for every use of a certain variable.
 I4:   t = x
   x' = y ∧ y' = x
 =   { I4 }
   x' = y ∧ y' = t
 =   { Assignment law (B) }
   (x := y) ; (x' = x ∧ y' = t)
 =   { Assignment law (B) }
   (x := y) ; (y := t) ; (x' = x ∧ y' = y)
 ⇐    { Skip introduction (C) }
   (x := y) ; (y := t) ; skip
 =    { Identity of ; (D) }
   (x := y) ; (y := t)
And, of course, it is also simple to implement I4 and prepend it to the program we just derived. Note: Since we are interested in a program that ends up with I4, we are going to use primed variables. (end of note)
   t' = x ∧ x' = x ∧ y' = y
 =   { Assignment law (B) }
   (t := x) ; (t' = t ∧ x' = x ∧ y' = y)
 ⇐   { Skip introduction (C) }
   (t := x) ; skip
 =   { Identity of ; (D) }
   t := x
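As a small illustration of the two derivations above, here is a Python sketch comparing the simultaneous assignment with its sequential refinement t := x ; x := y ; y := t. The function names are hypothetical and only serve the comparison; the last function shows what goes wrong when x is overwritten before its initial value has been used.

```python
def swap_simultaneous(x, y):
    """The specification x' = y ∧ y' = x as one simultaneous assignment."""
    x, y = y, x
    return x, y


def swap_sequential(x, y):
    """The derived program: (t := x) ; (x := y) ; (y := t)."""
    t = x
    x = y
    y = t
    return x, y


def swap_naive(x, y):
    """Incorrect ordering: x is overwritten while its old value is still needed."""
    x = y
    y = x
    return x, y
```

On the inputs 1 and 2, the first two functions both return `(2, 1)`, whereas the naive version returns `(2, 2)`.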
In short, this is why I don't bother to do it with the body of my loop.
Finally, the main interest of this programming exercise does not lie in the resulting program. Quite frankly, it accomplishes quite a banal task. However, the technique which led us to it is quite systematic. We used very little knowledge to find it and, unlike Feijen's version, we did not introduce assignments of '?' to mark the need for a new assignment and then 'guess' what the expression should be. To be honest, it is not a very hard guess but it is still outside of the scope of a calculation. The fact that I integrated that choice in a calculation makes it easier to justify.
Regards,
Simon Hudon
ETH Zürich
APPENDIX ========
Unusual laws of propositional calculus
(A)   A ∧ B ⇒ C  ≡  A ⇒ (B ⇒ C)    { "shunting" }
Hehner's programming laws [1]
(B)   (x := E) ; P  ≡  P [x \ E]     
              { "assignment law" P [x \ E] is the }
              { variable substitution notation,   }
              { P is a predicate, x a program     }
              { variable and E an expression      }
(C)   x' = x  ⇐   skip              
              { For any program variable x }
(D)   P ; skip  ≡   P                
              { For any program predicate P }
REFERENCES ==========
[0] Wim Feijen, WF114 - The problem of the table of cubes, http://www.mathmeth.com/wf/files/wf1xx/wf114.pdf
[1] Eric Hehner, A Practical Theory of Programming, http://www.cs.toronto.edu/~hehner/aPToP/

Sunday, August 1, 2010

A Prehistoric Qualification of Languages: Succinctness

This morning I read the essay Succinctness = Power [0] by Paul Graham whom, so far, I find (note the present tense) to be quite smart. I can't believe he would use lines of code or number of symbols as a meaningful metric for comparing implementations. He says that one of the goals of high level programming languages is to be more succinct in expressing a given amount of machine code. Of all people, I would expect somebody who spent so much time with LISP to see the fallacy.
The idea and benefit of a high level language is that it allows you to think in higher level IDEAS. Those ideas can be implemented using half a machine code instruction or a thousand; that is of no relevance to their use. What IS relevant is whether the properties of that particular abstraction are good enough to fit in the context of the solution while including very little (I am tempted to say none at all) information which does not pertain to the context it will be used in. And, finally, this is something LISP is interesting for because it drives home the message that those particular abstractions can be encapsulated in functions or data types but that you can also build up the language to cover them.
This is not to say that I prefer long solutions, but I see succinctness as a consequence (and not the main one at that) of a language's "power" (which I find to be a poor choice of word, especially since he does not define how he uses it right away). Its real power lies in the variety of abstractions that one can define and use. If you exaggerate the aim for succinctness, you end up with APL or PERL, which are notoriously bad. Their goal is not abstraction but packing as many operations as possible into tiny expressions. As a consequence, those solutions are unreadable and they just seem like a dignified way of writing short machine code. They give no more flexibility and no more readability.
This comment was mostly about the beginning of the essay. I don't have much to say about the rest except that, at some points, he exposes some really nice insights, but it really started off on the wrong foot and I didn't expect the rest of the essay to compensate for that and, in my view, it didn't.
[0] Paul Graham, Succinctness = Power, http://www.paulgraham.com/power.html
Regards,
Simon Hudon
ETH Zürich
August 6th 2010

Wednesday, July 28, 2010

I hate powerpoint slides

I am taking a break in my study session to write about powerpoint slides as they are used by lecturers. Actually, it is such a general phenomenon that powerpoint is more the name of one of the most popular products for producing those atrocities than the only one to do so. Since this is a general post for ranting, even if I try to rationalize what is going on, I will focus on how much I hate them.
It seems that most of them are produced as a combination of cue cards for the lecturer and visual entertainment for the childish students. It means that, although they can be effective at entertaining those with a short attention span (and I mention that I am one of those with a short attention span but not one of those who are entertained), they are quite ineffective as basic material for any lecture. This is caused by their cue card structure. Those cards are used to give directions to a speaker and they need not contain complete sentences since the material should be in the speaker's head. As a consequence, bad English, ambiguous statements and inaccurate vocabulary plague them. That would be alright if the cue cards were either discarded after use or kept privately for future repetitions of the lecture by the same person.
I don't know if it is possible to design proper slides that act both as visual aid and as supporting material for a course but, as I am about to start an academic career, I am completely opposed to their use in my lectures. I would rather use a combination of reading assignments and blackboard presentation, with some possible rare exceptions.
That being said, I can now resume my study of the most empty (in academic content) course in my school experience.
Best regards,
Simon Hudon
ETH Zürich
July 28th 2010
Post Scriptum: I am still studying for the same exam and I realize that the slides are really full of motherhood statements [1] like "Repeat the experiments until you get a reasonable standard deviation". I'm not sure how usual this is but I realize this is something that always annoys me when I see it anywhere.
[1] "Motherhood statements" is an expression I borrowed from Dijkstra; it designates a set of argumentative positions and pieces of advice that it does not make sense to oppose. In the case of the analogy, if you don't qualify your statement, you cannot just say that you support motherhood because the opposite does not make sense. Meyer also calls them argumentative platitudes.
end of post scriptum

Wednesday, July 7, 2010

About the Analogy between Verification and Compilation

Some people in the verification community have started responding to the complaint that formal proofs of program correctness are too long by making an analogy between verifiers and compilers:
Formal proofs of correctness are tedious in the same way coding in assembler is tedious. The use of tools, namely high level programming languages, solved the latter, and it can also solve the former.
I've seen it many times and I don't really know whom to credit for it. I think the analogy is fundamentally flawed. For one thing, the translation of a high level programming language into assembler is a computable problem and its use is rather straightforward if the language is well designed and its semantics is as simple and systematic as it can be. We don't have any such luck with automated theorem provers. Since their job is to solve an uncomputable problem, they have to use lots of heuristics which make the interface with the user fuzzy. When submitting a formula to the theorem prover, there is no way of saying whether it will be found to be a theorem or not because the class of recognized theorems is not precisely defined. On the other hand, given a theorem (in our case the conformance of a given program to a given specification) the problem of generating a proof is well known to be uncomputable.
But this should not deter us from seeking a formal backing for our design because, contrary to the folklore, writing a formal proof is not a tedious activity. It is hard but it is a very inventive process. It can become tedious though if one uses an inappropriate logic like, say, Gentzen's Natural Deduction or the sequent calculus. Although they are popular formal logic systems, they are hardly the only ones available. This is not to say that they are useless formal systems, just like Turing Machines are not useless. They are not meant to be used but are meant as a simplification for logicians and theoreticians to understand what tasks can be accomplished with the formal system. They are not suited to understanding how best to accomplish those tasks. Just like programming directly with Turing machines is a waste of effort, proving with Natural Deduction is also a waste of effort. It forces the user to make irrelevant distinctions, for example, with its rule for the proof of logical equivalence. The only way to prove it or even to use it is as a shorthand for two implications. Very often, you can exploit properties of logical equivalence in the calculations of theorems (like it is done in [0] and [1]) with at most half of the effort (and sometimes one fourth or less) that would be expended on the same proof in Natural Deduction.
Contrary to Natural Deduction, the use of calculational proofs allows one to focus on the subject matter and to use the activity of proving as an improvement of one's understanding rather than as a necessary evil for being certain of one's guesses. As a consequence, automating the construction of the significant proofs of correctness does not improve productivity but decreases it because it removes any useful hints that the (human) prover could use to orient his efforts in the right direction.
In short, in all important respects, the analogy is invalid and cannot be used to defend the necessity of automating theorem proving. It is so first because, of the two compared activities, only one solves a computable problem and, second, because one of them is tedious and repetitive whereas the other involves a lot of creativity. The fallacy comes from taking some properties of inappropriate formal systems as being valid for all of them. To be effective at proving, we don't only need a logic which is sound and theoretically powerful enough, we need a formalism suited as a carrier for reasoning.
References
[0] On the Shape of Mathematical Arguments, A.J.M. van Gasteren
[1] Predicate Calculus and Program Semantics, E.W. Dijkstra and C.S. Scholten

Monday, July 5, 2010

Quality Criteria for Writing

I have been less active in writing lately, and lots of explanations are possible. I have possibly ranted to my heart's content about my situation in Zürich and the sins of the industrial and academic communities, but that's probably exaggerated. As I discuss with people and try to be more nuanced in my opinions, I find new and interesting faults to describe, some of which I have started an entry about and never finished.
I stumbled upon EWD 1068 today which I had never read and which makes me presume that I might have mispostulated (please accept the offering of that verb) my audience as not being curious or sympathetic. In any case, I should know better than to think that people interested in my fabulation are not curious; this is very rarely a step by step tutorial to do anything in particular.
I find that, mostly, I feel shackled by Blogger. Whenever I find a very beautiful proof that I would like to share on my blog, half an hour to an hour of struggling with the layout dissuades me from sharing that kind of logical poetry. That is really a shame and I am as yet unsure what I'm going to do about it.
In any case, enjoy the reading of EWD 1068, I found it quite interesting.
Simon Hudon
ETH Zürich
July 5th, 2010

Wednesday, June 2, 2010

Some Light Technical Reading

I realize that I haven't posted anything in a while and, as an apology, I will explain to my readers (all four of them!) that I have several posts in a draft state which I cannot bring to a satisfactory state. To show that I am still alive and interested in writing, I offer you a reference to a proof I read today. I wanted to do the same thing but the solution is really delightful and I didn't think I could top it. So without further ado, here is an essay from Dijkstra's EWD series:
(for the interested, all or most of his essays can be obtained here:
Simon Hudon
Wednesday, June 2, 2010
ETH Zürich

Saturday, November 28, 2009

Back in the 60s

Since the beginning of Computing Science and computer programming, there has been an important evolution in the methods used to design software. We're now much better at producing software than we used to be. When I turn around and look at the Internet trend though, I have the impression of looking at the software scene of at least 40 years ago. Whereas people are now capable of choosing a programming language on account of the abstractions it allows them to formulate and use for desktop applications, it seems that web programming is still tightly tied to the individual idiosyncrasies of the browser that will be used to run the programs. Whenever I complain about the hazards of using Javascript, I get the answer that this is what gets executed in the browsers!
Since I'm very enthusiastic about the research that I do, I don't usually refrain from mentioning the sorry state of affairs of software and the thought that formal methods can be of great use for that. One year ago, I met a friend of a friend who is enthusiastic about web programming. When I suggested that we should be very careful about the properties that we select for our programs and the assumptions that their validity relies on, he pointed out that this would be of very little help for web programming because of the important differences that exist between the browsers. Continuing in that direction, I got the strong feeling that he considered the variety of platforms to be one of the most important and most interesting challenges of web programming. I could not help but be reminded of the accounts of the software scene of the 60s that I have read. The accidental complexity is so important that people mistake it for the core of the problem. In that respect, I guess it is no surprise that web applications are of such poor quality.
What triggered the present note is my suffering from the poor quality and random behavior of Facebook, Google Wave and, while I type, I am reminded that Blogger is not much better.
A final word, lest I be mistakenly assumed to be satisfied with the state of the art in non-browser-based applications: they appear much better in comparison to web applications but they have important shortcomings too, and I think that they are part of the problem of the web scene.
Simon Hudon
November 28th
Meilen

Wednesday, November 18, 2009

More of the Same

I just submitted a post on language design and I feel that this one is closely related. I just attended a talk in the context of my software verification course. We are now studying techniques related to data flow analysis and we had a speaker present to us how he used such techniques to measure the quality of a test suite. A priori, I am not very much into testing but still, I think there is something to be done with the topic. As far as I know, the only research in that respect that treats testing in a decent way considers the specifications even though it also uses the program's structure as a guide for testing. Now that I think of it, this is the shortcoming that ticked me off the most. The work was based on Java code and there was no hint of a suggestion that considering an invariant or any other statement of abstract properties of the data or of the program would enhance the assessment of the test suites. The code was taken as it stands, assuming that the mind of the programmers is impenetrable. To avoid repeating myself, I will simply say that the overwhelming feeling I got while listening was: "this is advanced computing theory of obsolete programming techniques".
I have no difficulty explaining its survival though: it does not suggest the need for education for anybody and produces a tool which can be used without thought. In other words, it's a fancy way of patting the industrial managers and the programmers alike on the back. Nothing is more welcome than being told by an automated tool that you're doing a good job. In my eyes, this is just more of the same.
Simon Hudon
Zürich
November 20th, 2009