Monday, July 7, 2014
Where have I been?
I have undergone a peculiar transformation since I moved from Zürich to Toronto. I used to think that you could study a software system purely from a logical point of view and that, once you had proved all you needed to prove, all that was left to do was to run the software. I have done it for some non-trivial software systems, but I realize that, for a bigger piece of software, the amount of work needed to cover the whole thing would be excessive, both in the sense of "exceeding necessity" and of "exceeding my ability to do so". I am reworking my vision of software design as an activity that includes proving theorems but which also comprises other validation techniques. For example, random testing as done by QuickCheck in Haskell is often a good compromise. The key is to choose which technique applies best.
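To make the QuickCheck remark concrete, here is a minimal example of the kind of random testing I have in mind; the property below is only an illustration and is not taken from any of my projects.

    import Test.QuickCheck

    -- Reversing a list twice should give back the original list.
    prop_reverseTwice :: [Int] -> Bool
    prop_reverseTwice xs = reverse (reverse xs) == xs

    -- quickCheck generates 100 random lists by default and reports
    -- the first counterexample it finds, if any.
    main :: IO ()
    main = quickCheck prop_reverseTwice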
For instance, I started developing a program one year ago and it has now become relatively big (considering that I'm the only person developing it) and, to my surprise, it works well even though I didn't prove a single property about it. In the back of my mind however, I keep pulling some parts of the design aside and figuring out the logic that would make it possible to prove that it is correct. One of these days, I'll prove the correctness of some crucial parts of my programs and I think that will be both really interesting and really useful -- especially considering that the program is a formal verifier.
On the other hand, I am also discovering the interplay between mastering the theory and having good software tools for designing software. I used to think that, if you know the theory well, you have no need for tools. I am realizing that this is a pretty silly point of view: the program that I mentioned above is specialized in writing proofs for simple conjectures, and it is incredibly fast. In one of the small systems that I use the tool for, there are around 500 conjectures that need to be proved. Most of them are incredibly easy to prove, but they still need to be proved. My tool takes about 30 seconds to prove 499 of them, leaving me with the one truly difficult theorem. That is incredibly helpful because I can now focus my attention on the hardest of the hard design problems. The best part is that, compared to an interactive prover, I do not need to invoke the prover manually on each proof obligation: they are all attempted automatically, except for those for which the user has provided a proof.
The tool is designed to allow me to verify Unit-B models. I don't think I would be able to use Unit-B as confidently and as efficiently as I do without it. Even better, when using Unit-B to develop the very first examples, every time I was faced with a problem, I could improvise a solution, think really hard, and convince myself that it was valid. This makes the method very inconvenient to teach because I hardly have a list of all the tricks that I use. By implementing a verifier, I need to identify the truly useful techniques, make sure that they can be used with confidence, and have a clear explanation for whatever I choose to exclude. In other words, I am now forced to formulate the rules of the method independently of any one example.
So much for simplistic views of the world.
Tuesday, February 28, 2012
My Master Thesis
For a long time, I have wanted to talk about discrete transition systems as the main abstraction of program execution. People use such abstractions for two reasons: producing theoretical results and reasoning about their programs. Although I have no problem with the first one (I don't quarrel with people looking for theoretical results; I think it's a valid goal), I am looking to reason about my programs in order to document them with a proof of their correctness.
I always had a hard time choosing which part to start with, and I think I now have something nicely simple to say about Event-B, UNITY and my own creation, Unit-B. It focuses on the achievements of my master thesis, but it should already make sense to anyone who is not my supervisor and who has no experience with any of the formal methods mentioned.
The goal of my master thesis was to design a formal method for developing correct parallel programs. The result is Unit-B, a formal method inspired by Event-B and UNITY.
Using Unit-B, one can design a concurrent program by refining, step by step, a specification into a program, in a similar fashion as in Event-B. However, while refinement in Event-B only preserves safety properties, refinement in Unit-B also preserves liveness properties. This means that, once it is established that a specification S satisfies a progress property P, all specifications and programs refining S will also satisfy P. It follows that, as opposed to Event-B, it is possible in Unit-B to introduce progress properties into a specification along the refinement process, exactly at the time when they make the most sense, rather than at the very end of a development.
The technique has two benefits. First, in Event-B, when one tries to establish a liveness property, it is often too late: while trying to make the system safe, one has removed any possibility for the system to perform a useful function. Such is not the case with Unit-B. By tackling progress properties before the related safety properties, we eliminate the possibility of addressing safety concerns by being too conservative.
Second, by putting the requirement for progress as the center of attention in the process of refinement, the specification ends up shaping itself around the progress property. In other words, the concern for progress dictates the skeleton of a valid program. After that, the concern for safety can be seen as putting the flesh on the bones.
Compared to UNITY, this approach has the benefit of making the transition from specification to program smoother. Since a program is just a special kind of specification in Unit-B, the suitable application of refinement to specification will slowly give birth to a program. In UNITY, a specification and a program are two completely different things making the refinement of a specification and the creation of a program two distinct activities.
From UNITY, Unit-B takes the temporal logic and the fairness assumption in the execution of its transition systems. Fairness facilitates the mapping of progress properties to transition systems, and the temporal logic is tailored to express simply the properties that can be mapped to programs.
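To give a rough idea of what such a property looks like, here is the shape of a typical progress property in UNITY-like vocabulary; it is only an illustration of the general idea, not one of Unit-B's actual proof rules.

req ↦ ack      -- "whenever req holds, ack will eventually hold"

Roughly, one establishes it by exhibiting an event that establishes ack whenever it is executed in a state satisfying req ∧ ¬ack, and by checking that no other event can falsify req before ack holds; the fairness assumption then guarantees that the event is eventually executed. Because refinement in Unit-B preserves liveness, such a property, once established, continues to hold in every subsequent model.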
The semantics of Unit-B is defined using R.M. Dijkstra's computation calculus, which allowed me to be very fine-grained in the choice of the refinement rules and in the proof of their correctness.
Cheers,
Simon Hudon
York U
Toronto, Canada
February 28th 2012
Tuesday, July 5, 2011
On Function Purity and Simple Framing
Here is the latest blog post by Bertrand Meyer.
http://bertrandmeyer.com/2011/07/04/if-im-not-pure-at-least-my-functions-are/
It talks about the problem of function purity and broaches the subject of framing. In short, the frame of the specification of a routine is the set of variables (in the broad meaning of the term) that the execution of the routine can modify. In object-oriented programming, it is especially tricky because of the sets of dynamically allocated objects whose state it might be necessary to alter in a given routine. Indeed, such a set can be taken to be the set of objects reachable from a given root. In a limited solution, such as the one shown by Meyer in his latest post, you only name the variables and expressions that can change. In more elaborate solutions, such as the dynamic frames of Kassios (as quoted by Meyer), it is possible to provide a set of objects together with the attributes that can be expected to change. That is especially handy for specifying dynamic data structures like linked lists.
Such a solution seems to present the disadvantage of disallowing the extension of frames through inheritance, which Meyer preserves. I think a compromise can be reached between the two but it is a bit too involved for a post only intended for sharing another blog post.
Tuesday, June 21, 2011
Why I Prefer Predicates to Sets
Various recent comments by others have made me realize that it is about time I write a comparison between sets and predicates.
One comment that springs to mind is a response I got when I said that I almost never use sets, that I prefer predicates. I was asked what the point of choosing was, since predicates can be represented as sets: a predicate can be seen as a boolean function, which in turn can be seen as a functional relation, which is basically a set of tuples. More directly, to every predicate one can associate a set which contains exactly the objects that satisfy the predicate. To which I have to add that, for each set, there exists a characteristic predicate which is satisfied exactly by the objects contained in the set.
What we just established is the semantic equivalence between the two. Such equivalence is less important than it might seem. From a logician's point of view, the equivalence tells us that for each theorem of predicate logic, there should be an equivalent theorem in set theory.
From an engineering and scientific point of view, that is from the point of view of someone who wants to use either of these tools, there is an important difference, one of methodological importance, which, understandably, is irrelevant for logicians.
The difference lies in the flexibility of the related notation. It is hence of a syntactic nature. If we want to let the symbols do the work --and I do--, we have to make sure that we can find solutions in the simplest possible way.
One way of doing this is to avoid spurious distinctions, i.e. to preserve symmetry as much as we can. This is where predicate calculus [0] is superior to set theory. If we start by encoding set theory into predicate calculus, we will see, little by little, a fundamental difference emerge. In the following, capital letters will stand for sets and their corresponding predicates will be the same letter in lower case. As an example:
(0) x ∈ P ≡ p.x for all x
with the dot standing for function application and ≡ for logical equivalence [1]. It relates a set to its characteristic predicate.
Note:
Please notice that the convention of upper case and lower case letters is nicely symmetric. We could have defined the characteristic predicate of a set as cp.S, or the set of a predicate as set.p, but we chose not to in order to preserve the symmetry between the two. Instead of saying that one is the characteristic predicate of the other, we say that they correspond to each other, and the correspondence does not have to be spelled out in every formula where they appear.
(end note)
Notation
Here is a little bit more of syntax before I start the comparison. I take it mostly from [DS90].
If we want to say that a predicate p holds exactly when q holds we can express it as follows:
〈∀ x:: p.x ≡ q.x〉
It is a bit too verbose for something that is supposed to relate p and q very simply and, for that reason, we introduce a shorthand. Instead of quantifying over x and using x everywhere in the formula, we will have an operator that will say: this predicate holds everywhere:
[p ≡ q]
It is more concise and it is an improvement. It might seem like cheating, though, unless we know of the notion of the pointwise extension of operators. For instance, ≡ applies to boolean operands, but we allow ourselves to apply it to compare boolean-valued functions too. This is how it then behaves:
(1) (p ≡ q).x ≡ p.x ≡ q.x for all x
We implicitly postulate that all logical connectors apply to boolean-valued functions. If we add one more definition, the previous universal quantification can be calculated to a form with an everywhere operator.
(2) [p] ≡ 〈∀ x:: p.x〉
〈∀ x:: p.x ≡ q.x〉
=    { pointwise extension }
〈∀ x:: (p ≡ q).x〉
=    { (2) with p := (p ≡ q) }
[p ≡ q]
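For readers more comfortable with programs than with notation, the pointwise extension and the everywhere operator can be mimicked in a few lines of Haskell; the names below are mine, not those of [DS90], and the everywhere operator is restricted to a finite list of sample points.

    type Pred a = a -> Bool

    -- Pointwise extension of ≡ and ∧, as in (1).
    (.==.), (.&&.) :: Pred a -> Pred a -> Pred a
    (p .==. q) x = p x == q x
    (p .&&. q) x = p x && q x

    -- The everywhere operator [p] of (2), over a finite domain.
    everywhere :: [a] -> Pred a -> Bool
    everywhere xs p = all p xs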
Translating Sets
We can start our comparison by translating sets into predicate formulations. We will concentrate on set equality, subset, intersection, union, subtraction and symmetric difference.
(3) P = Q ≡ [p ≡ q]
(4) P ⊆ Q ≡ [p ⇒ q]
(5) x ∈ P ∩ Q ≡ (p ∧ q).x
(6) x ∈ P ∪ Q ≡ (p ∨ q).x
(7) x ∈ P \ Q ≡ (p ∧ ¬q).x
(8) x ∈ P △ Q ≡ (p ≢ q).x
with △ being the symmetric difference, the set of elements present in exactly one of the two sets, and ≢ being the discrepancy, the negation of ≡. We see that each set operator translates into a form with a single logical connector, the first two with the help of the everywhere operator. Also, no set operator is built uniquely around either implication or equivalence. Therefore, there is no unique translation of expressions like the following:
[p ≡ q ≡ r ≡ s]
To allow us to manipulate the above, we introduce a definition for set equality. It can be calculated from our previous postulates.
(9) P = Q ≡ 〈∀ x:: x ∈ P ≡ x ∈ Q〉
[p ≡ q ≡ r ≡ s]
=    { double negation }
[¬¬(p ≡ q ≡ r ≡ s)]
=    { ¬ over ≡ twice, see (10) below }
[¬(p ≡ q) ≡ ¬(r ≡ s)]
=    { (11) }
[p ≢ q ≡ r ≢ s]
=    { everywhere expansion (2) }
〈∀ x:: (p ≢ q ≡ r ≢ s).x〉
=    { pointwise extension of ≡ }
〈∀ x:: (p ≢ q).x ≡ (r ≢ s).x〉
=    { (8) twice }
〈∀ x:: x ∈ P △ Q ≡ x ∈ R △ S〉
=    { (9) }
P △ Q = R △ S
(10) [¬(p ≡ q) ≡ ¬p ≡ q]
(11) [¬(p ≡ q) ≡ p ≢ q]
Of course, for someone used to it, this is a simple conversion and there would be no need to be this explicit but I include it nonetheless for the sake of those who are not used to it.
Our formula is therefore proven to be equivalent to the one below.
P △ Q = R △ S
We could also introduce the negations elsewhere and get:
P = Q △ R △ S
Since equivalence is associative --that is, ((p ≡ q) ≡ r) ≡ (p ≡ (q ≡ r))-- and symmetric --that is, (p ≡ q) ≡ (q ≡ p)--, there are a whole lot of other formulae from set theory that are equivalent to the predicate calculus formula we showed. All those results create many theorems which basically "mean" the same thing but that one may have to remember anyway. As a comparison, our one predicate formula can be parsed in various ways and we can use it to substitute p ≡ q for r ≡ s, or r for p ≡ q ≡ s, to name just two usages of a single simple formula.
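Associativity and symmetry of ≡ are easy to confirm mechanically at the level of boolean values. A throwaway QuickCheck script --my own, and of course no substitute for a proof-- does the job, with Haskell's (==) on Bool playing the role of ≡:

    import Test.QuickCheck

    prop_assoc :: Bool -> Bool -> Bool -> Bool
    prop_assoc p q r = ((p == q) == r) == (p == (q == r))

    prop_sym :: Bool -> Bool -> Bool
    prop_sym p q = (p == q) == (q == p)

    main :: IO ()
    main = quickCheck prop_assoc >> quickCheck prop_sym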
One could suggest that the problem lies with the usual reluctance of logicians to talk about set complements: that if we had an operator, say ⊙, to stand for the complement of △ --which can be seen as problematic because it creates a set with elements taken from neither of its operands--, we wouldn't need to introduce spurious negations. The above formula could then be translated as:
P ⊙ Q = R ⊙ S
It remains to choose which equivalence becomes an equality and which ones become the complement of a symmetric difference. I believe the biggest problem to be that set equality mangles two orthogonal notions together: boolean equality and (implicit) universal quantification. The same goes for implication.
In short, this is one example where set theory destroys a beautiful symmetry and introduces spurious distinctions that pollute one's reasoning rather than help it. Whenever I have problems to solve for which calculations can help me, in the vast majority of cases I use predicate calculus rather than set theory.
Simon Hudon
April 10th, 2011
Meilen
Notes
[0] I use the calculus, not the logic. The difference is that the logic deduces theorems from other theorems whereas the calculus simply helps us create relations between formulae and find new formulae from old ones whether they are theorems or not.
[1] Like many others, I have forsaken the double arrow for that purpose. The choice can be traced back to [DS90] and [AvG90].
References
[AvG90] Antonetta J. M. van Gasteren. 1990. On the Shape of Mathematical Arguments. Springer-Verlag New York, Inc., New York, NY, USA.
[DS90] Edsger W. Dijkstra and Carel S. Scholten. 1990. Predicate Calculus and Program Semantics. Springer-Verlag New York, Inc., New York, NY, USA.
Tuesday, June 14, 2011
"Lifting weights with your brain"
I am posting the following just for the sake of sharing a quote I liked in an essay I'm reading:
In workouts a football player may bench press 300 pounds, even though he may never have to exert anything like that much force in the course of a game. Likewise, if your professors try to make you learn stuff that's more advanced than you'll need in a job, it may not just be because they're academics, detached from the real world. They may be trying to make you lift weights with your brain.
By Paul Graham, Undergraduation. I don't suppose it will completely eradicate the comment "What good does it do me to learn [put a beautiful piece of theory, technology, etc here]? People don't use it." Another reply of course might be to point out that "maybe they should".
Simon Hudon
June 14th, 2011
Meilen
Monday, June 6, 2011
Decidability
The proof of the undecidability of the halting problem has always fascinated me. It is a formidable result to behold, at least the first time around. It relies on the construction of a program that cannot exist because of a contradiction drawn through self-reference. Here is an interesting rendering of the argument by Christopher Strachey: An Impossible Program.
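For concreteness, here is a rough Haskell rendering of the construction; the names and the Prog representation are my own, not Strachey's, and the claimed halting decider is taken as a parameter since, as the argument shows, it cannot actually be written.

    -- A hypothetical encoding of program texts.
    newtype Prog = Prog String

    -- Given any claimed halting decider, build the program that defeats it:
    -- it loops exactly when the decider says it halts.
    spite :: (Prog -> Bool) -> Prog -> ()
    spite halts self = if halts self then loop else ()
      where loop = loop   -- deliberate non-termination

Applying spite to the decider and to an encoding of the resulting program yields the contradiction: the program halts if and only if the decider says it does not.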
However, an interesting paper by Eric Hehner (Problems with the Halting Problem) casts doubt on the correctness of the usual proof. He proposes that the contradiction in the proof is due to self-reference rather than to the structure of the halting problem. It is a good read and I strongly recommend it.
Simon Hudon
Meilen
June 6th 2011
Monday, May 9, 2011
A Lecture Supported by an Automated Theorem Prover
UPDATED: added note [1] on mutual implication
On Monday last week, I attended the guest lecture that Prof. Tobias Nipkow from the Technical University of Munich (TUM) gave at ETH. I had heard of him before because of his work on the Isabelle theorem prover. An interesting thing before I start: he refers to it as a proof assistant.
I didn't expect much of this talk because of my reluctance to rely on automatic provers to do my job and because I expected a very technical talk about proof strategies implemented, or to be implemented, in Isabelle. Since such issues need concern only people involved in the construction of theorem provers, and I'm not one of them, I didn't think it would appeal to me.
I was pleasantly surprised that he had chosen to talk about a course he had recently started to give on the topic of semantics. Said course was noteworthy because formal methods and an automated prover were used as teaching vehicles. I am pleased that he decided to use formal methods as a tool for understanding the topic and that he opted to spend more time on semantics, the subject matter, and less on logic and formal methods.
It is my considered opinion, however, that a prerequisite course on the design of formal proofs would be most useful, even necessary. I draw this conclusion because I believe that acquiring effective techniques for designing elegant proofs can help tremendously in making hard problems easier. But, since he didn't want to teach logic, he gave very little attention to the design of proofs and adopted "validity of the proofs" as the goal the students had to reach in their assignments. It was asked --by someone else, before I could ask it-- what importance he gave to the style of the submitted proofs. He said none, because some (in his opinion) very good students had an awful style and he didn't want to penalize them for it. He considered that, having submitted a correct proof, the student had shown that he understood. I would say that, in doing so, he was doing his students a disservice. However, I can now better understand his position because the underlying assumption is very popular: since words like style and elegance have a very strong aesthetic connotation, it automatically becomes whimsical to judge them or to encourage the students to improve upon them. After all, we're here to do science, not to make dresses!
*
Even when it has been successfully argued just how useful mathematical elegance can be, people keep opposing its use as a yardstick on the ground that it is too subjective and enforcing one standard would be arbitrary.
It turns out that, not only are elegance and style very useful, but they can be analyzed and criticized very objectively. It was the preferred subject of Dijkstra in the second part of his career and it appears almost everywhere in his writing --see EWD619, "Essays on the nature and role of mathematical elegance", for such an exposition [0]--.
The usefulness of an elegant style comes from the fact that it yields simple and clear proofs requiring very little mental work from the reader (which is not something to be underestimated) but, more importantly, it allows the writer of a proof to be economical in the use of his reasoning abilities. This can make the difference between a problem which is impossible to solve and one which is easily solved. On the other hand, in a course such as the one Nipkow has designed, the most interesting things for students to learn are not the individual theorems that they proved but the techniques that they used to find a proof. If the techniques are not taught, it can be expected that the skills the students acquire will be of much less use when they are confronted with different problems.
What an elegant style boils down to is, to put it in Dijkstra's words, the avoidance of complexity generators and the separation of one's concerns. He also argued that concision is an effective yardstick to evaluate the application of those techniques. Indeed, concision is effectively achieved both by separating one's concerns and by avoiding complexity generators. The best-known complexity generators are case analyses, proofs of logical equivalence by mutual implication, and inappropriate naming. The first two are legitimate proof techniques but their application doubles the burden of the proof. It doesn't mean that they should never be used but rather that they should be avoided unless the proofs of the required lemmata differ significantly. [1] I say significantly to stress that, in some cases, they differ only in small ways and, upon closer scrutiny, the differences can be abstracted from.
The issue of naming is also closely related to the separation of one's concerns but, being a tricky issue, I would rather point the reader to chapter 15, "On Naming", of van Gasteren's book "On the Shape of Mathematical Arguments", which covers the subject very nicely. Since Dijkstra was van Gasteren's supervisor, he wrote the chapter with her and it has earned an EWD number: 958 [0]. This allows me to skip directly to the matter of separation of concerns.
*
While debating concision and Nipkow's lecture with my supervisor, something kept coming up in his arguments. I was favoring short formal proofs and he kept asking: "What if a student doesn't want to call 'auto' [the function of Isabelle which can take care of the details of a proof step] but wants to go into the details to understand them?" First of all, I have to point out that being the input for an automated tool doesn't relieve a proof from the obligation of being clear. [2] Unlike Nipkow, I hold that the proof must include intermediate formulae accompanied by a hint of how one proceeds to find them. This would correspond to the combination of what he calls a proof script --a series of hints without intermediate formulae-- and a structured proof --a series of intermediate formulae with little to no hints; this is what he prefers to use--. In that respect, 'auto' is no better than the hint "trivially, one sees ...". The choice of how early one uses 'auto' is basically a question of decomposition. It is unrelated to the peculiarities of the prover; it depends on how clear (to a human reader) and concise the proof is without the details. What my supervisor and Nipkow call "using auto later in the proof" would be, in a language independent of the use of an automated prover, including more details. If the proof is clear and concise without those details, they don't belong in the body of the proof. It is that simple. One could, however, include them in the proof of a lemma invoked in one (or many) proof steps of the main proof.
By including a lemma, one doesn't destroy concision because the proof of said lemma can be presented beside that of the main theorem rather than intertwined with it. The difference is that, while one reads a proof, each step must clearly take the reader closer to the goal. Any digression should be postponed so as not to distract attention from the goal. One good indication that a lemma is needed is when many steps of a proof are concerned with a different subject than the goal. For instance, when proving a theorem in parsing theory, if many successive steps are concerned with predicate calculus, the attention is taken away from parsing. Instead, making the whole predicate calculation in one step and labeling it "predicate calculus" is very judicious. Nothing prevents the proof that pertains to predicate calculus from being presented later on, especially if it is not an easy proof.
The important point here is that sticking to the subject at hand doesn't mean forgetting that there are other problems that need one's attention. It means dealing with one problem at a time, each time forgetting momentarily what the other problems are. This is exactly what modularity is about.
Furthermore, with an automatic prover, nothing prevents someone from using 'auto' to commit the sin of omission: details that would make a step clear are then missing. It is then a matter of style to judge how much should be added. This reinforces my point that good style should be taught, because clarity is the primary goal of proofs [3].
With respect to such a tool, I would welcome the sight of one where keywords like 'auto' are replaced by catchwords like 'predicate calculus', hinting at the existence of a simple proof --at most five steps-- in predicate calculus --in this case-- that supports the designated step. More often, we could use invocations like 'theorem 7' (or '(7)' for short) or 'persistence rule' as a way of invoking a very straightforward application of a referenced theorem. It is then clear to the reader what is going on and, to the prover, the problem is very simple: it looks for a simple proof. If no proof of at most five steps exists, the search fails. More importantly, the user should have foreseen it. The prover should never be used to sweep problems under the carpet.
As with the type system of a programming language, it should be easy for the human reader to see that a given proof step is going to be accepted. It is by being predictable that automatic tools are useful, not by working magic.
*
By way of conclusion, I turn to another aspect of Nipkow's talk. He said that applying formal proofs to the teaching of computer science is especially useful in those subjects where the formalization is close to people's "intuition", and that in the rest of the subjects it is a bad idea. I say this is a drawback of his approach, not of formal proofs. If you take formal proofs to be merely a proper input format for the mechanized treatment of intuitive arguments, it seems inevitable to run into that problem. However, formalism can be used for purposes which have nothing to do with mechanization: the purposes of expressing and understanding precise and general statements. Used in this capacity, formalism allows a (human) reasoner to take shortcuts which have no counterpart in intuitive arguments. It is one of the strengths of properly designed formalisms that you can use them to attain results which are beyond intuitive reasoning.
Case in point: the topics where Nipkow says formal proofs are not an appropriate teaching vehicle are those where it would be crucial to rely on formalisms that are not merely the translation of intuitive reasoning. To use those, we would have to stop relying on our intuition and acquire the techniques of effective formal reasoning. This leads me back to my first point. 'Effective' means that we're not interested in finding just about any proof: we are looking for a simple and elegant one, so that problems whose solution would otherwise be beyond our abilities can admit a simple solution.
In other words, striving for simplicity is not so much a purist's concern as a pragmatic preoccupation: it allows us to solve, with as little effort as possible, problems that are beyond the unaided mind.
Simon Hudon
ETH Zurich
May 11th 2011
[0] For the EWD texts, see http://www.cs.utexas.edu/users/EWD/
[1] For those who are used to deductive systems for formal reasoning, the question "what alternative do we have to mutual implication?" might have come up. The answer is that equivalence is a special case of equality and should be treated as such. That is to say, its most straightforward use is "substituting equals for equals", also known as "Leibniz's rule", and this is also the most straightforward way to prove an equivalence. Naming equivalence "bidirectional implication" is a horrible deformation. It is analogous to calling equality of numbers "bidirectional inequality": it hints at one way of proving it using implication but does not distinguish between the shape of one possible proof and the theorem itself. Indeed, I realize that some people immediately think of mutual implication when they see an equivalence. It's a shame.
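As a small illustration of my own --it is not taken from the lecture or from any particular text--, here is the contrapositive rule obtained by a single chain of equivalences, substituting equals for equals at every step, with no case split and no mutual implication:

¬q ⇒ ¬p
=    { rewriting ⇒ }
¬¬q ∨ ¬p
=    { double negation }
q ∨ ¬p
=    { ∨ is symmetric }
¬p ∨ q
=    { rewriting ⇒ }
p ⇒ q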
[2] In this respect, theorem provers seem to be more primitive than our modern programming languages. Whereas programming languages become more and more independent of their implementation in order to embody abstractions (for instance, Java, unlike C, has no "register" keyword for variables), the proof languages of automated provers shamelessly include a lot of specific commands for what I shall call "sweeping the rest of the problem under the rug". In proofs explicitly constructed for human readers, those would be replaced by vague expressions like "well you know..." followed, if presented in a talk, by a waving of the hands intended to say "anyone but idiots will get this".
[3] This goes against what seems like a school of thought that views formal proofs as the input of tools like theorem provers. It is easy for people of that school to draw the fallacious analogy with assembler programming. The important difference is that a proof designed with style can be the vehicle of one's understanding. Since mechanisms like abstraction are routinely applied to make proofs as simple as they can be, if one wants to understand an obscure and counter-intuitive theorem, an elegant formal proof is very likely to be the best way to do it. On the other hand, assembler is not a language which helps one understand an algorithm. It is clearly an input format for a microprocessor and people should treat it as such.
Special thanks to my brother Francois for applying his surgical intellect to the dissection of the objections made against the notions of elegance and style, and to Olivier, who also helped me improve this text.