Tuesday, November 9, 2010

On the Notion of Ghost Variables

When one reads the literature of formal methods, especially that of formal verification, one encounters the notion of ghost variable, also called model variable. In short, these are variables added to a program for the sake of verification, to express simply and concisely the data abstraction a certain module is supposed to implement. They reflect very well the a posteriori nature of verification, since they are added to a presumably existing program. They are characteristically absent from the regular data flow and control flow and, as such, their values need not be stored during the execution of the program. They are mere abstractions.

The naming seems to come from the fact that the predominant role the verification community gives to a program variable is to be an abstraction for the storage of a category of important values. A variable whose values need not be stored then feels odd: it looks like a variable but it is not really one. Hence the special attribute is added to warn anyone reading the program that this is a strange sort of variable.
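
To make the notion concrete, here is a small illustration (a Python sketch of my own, not taken from any particular verification tool): a module that keeps a running average stores only a sum and a count, while the specification speaks of the whole history of samples. That history is the ghost (or model) variable; in the sketch it is kept in an ordinary field only so that the assertions can be executed, whereas in a verified program it would live in annotations and would never be stored.

# A minimal sketch of a ghost variable (the names are mine).  The list
# ghost_samples records the abstract history the specification talks
# about; the implementation itself only ever needs total and count.
class Averager:
    def __init__(self):
        self.total = 0           # stored variable
        self.count = 0           # stored variable
        self.ghost_samples = []  # ghost variable: kept here only so the
                                 # assertions below can be executed

    def add(self, x):
        self.total += x
        self.count += 1
        self.ghost_samples.append(x)  # ghost update, no effect on output
        # coupling invariant between the abstraction and the stored data
        assert self.total == sum(self.ghost_samples)
        assert self.count == len(self.ghost_samples)

    def mean(self):
        # specification: the mean of the abstract sample history
        # (assumes at least one sample has been added)
        return self.total / self.count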

We can take a different view of variables, though. If we want to understand a problem, then create a program to solve it and a compelling argument for trusting the results of that program, we need a way to speak of the problem. In particular, it is important to identify the central abstractions, the objects of central importance, and give them names, so that their properties of interest can be spelled out using those names rather than mainly in terms of their relation to other objects. One such example is given by:

Jim and Julia's brother play football together. During one match, Jim tackled Julia's brother and hurt him badly.

Compare it to:

Jack and Julia are siblings. Jim and Jack play football together. During one match, Jim tackled Jack and hurt him badly.

Notice how disentangled the second version of the story is. Every sentence states a fact and, if we are interested in the accident, we realize that Julia has very little role to play in it: the first sentence can be ignored altogether without missing any information. In the first version of the story, Julia and her relationship to one of the players are dragged all through the story although, as far as we can see, she plays no role in it herself. More about the issue of naming in mathematical arguments (which is relevant to software design) can be found in [0], in chapter 15, which is dedicated to the subject.

The point I am getting at with all this rambling on naming is that, during the analysis of a programming problem and the design of a solution, the introduction of variables should be seen as the introduction of names for the relevant quantities. The binary representation of the exact set of variables introduced should not be of immediate concern. A confidence that an efficient representation can be found later on is healthy, but any further thought about it should be postponed. What Dijkstra calls a change of coordinates in [1] can be applied later to improve efficiency. In the meantime, the prime concern is to establish sufficient conditions for an implementation to be correct. It is also possible that the set of variables chosen during the design process does not have to be transformed at all to get an efficient implementation and, again, this is of no concern while we come up with them.

When I say those problems are of no concern, I mean not only that we should not concentrate on them but also that we should not let the vocabulary we use reflect a distinction we are not ready to make. For that purpose, I would abandon the notion of model or ghost variables or, at least, downplay their role considerably, and introduce instead the notion of stored variables. Those are the variables of the final implementation whose values have an impact on the control flow and on the data flow directed to the output variables. By symmetry, we can call the others model or ghost variables but, since we reduce their importance to the final implementation stage and since, even there, they are the ones to be left out rather than kept, omitting to name them would not be a big methodological impediment.

Regards,

Simon Hudon
ETH Zürich
Tuesday the 9th of November 2010

References

  [0] A.J.M. van Gasteren, On the Shape of Mathematical Arguments
  [1] E.W. Dijkstra, EWD1032 - Decomposing an Integer in a Sum of two Squares

Monday, October 18, 2010

The Segment of Maximum Sum

This is the recording of a solution found together with Bernhard Brodowsky, who did most of the work, in one of the first lectures where I explained the techniques to him. The problem is to find a segment of an array for which the sum of the elements is maximal. We stopped when we had a simple specification of the body of the loop, for fear that actually calculating the assignments would be too much tedium. In this recording, I will calculate them to see exactly how tedious it is. Also, we did not pay any mind to whether we were computing too many auxiliary values --that is, using a value before updating the variable holding it in the loop body, so that one computed value is always thrown away--. In this recording, the solution will be changed slightly to avoid that.

First, here is the postcondition that we want to implement:
 P0: r = 〈↑ p, q: 0 ≤ p ≤ q ≤ N: 〈∑ i: p ≤ i < q: X.i〉〉
where ↑ denotes the maximum of its two operands. Like conjunction and disjunction, it is idempotent, associative and symmetric, so we can use it in the same way as the universal and existential quantifications. The first thing to do is to introduce a name for the summation part so that we can manipulate it independently of the overall specification.
 (0) S.p.q = 〈∑ i: p ≤ i < q: X.i〉
And the specification becomes:
 P1: r = 〈↑ p, q: 0 ≤ p ≤ q ≤ N: S.p.q〉
The traditional way to get an invariant from this is to replace N by a variable, let's call it n, and use n = N as an exit condition:
 J0: r = 〈↑ p, q: 0 ≤ p ≤ q ≤ n: S.p.q〉 
 C0: n = N
By symmetry, we can assume that 0 is a proper initial value for n which makes S.0.0 a good one for r.
 I0: n = 0
 I1: r = 0
Last formality: having n increase by one at each iteration will guarantee termination provided 0 ≤ N.
 S0: n' = n + 1
Let's now start calculating with J0 to see what adjustments we have to make to r to preserve it. Two distinct heuristics lead us to start calculating with the right-hand side of the equality: it is much more complicated than the left-hand side, so we can expect most of our moves to be eliminations, and the only program variable in the left-hand side, r, does not have a matching assignment yet, so starting from that side we would basically be stuck right at the start. For a given loop with invariant J and body S, the invariance of J is assured by:
 J'  ⇐  J ∧ S
where J' is J with all its program variables primed. It means we can use whatever is part of the body or the invariant to prove J's invariance; to do so, we can invent new statements for the body as we go.
    〈↑ p, q: 0 ≤ p ≤ q ≤ n': S.p.q〉 
  =   { S0 }
    〈↑ p, q: 0 ≤ p ≤ q ≤ n+1: S.p.q〉 
  =   { Split range }
      〈↑ p, q: 0 ≤ p ≤ q ≤ n: S.p.q〉
    ↑ 〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉 
  =   { Use final value of a new variable using
         Q0: s' = 〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉 and J0 }
    r ↑ s'
  =   { New statement: S1: r' = r ↑ s' to conclude
         the proof of J0's invariance }
    r'

 Q0: s' = 〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉
 S1: r' = r ↑ s'
Q0 links unprimed variables with primed variables in a way that cannot be evaluated. Let's turn it into an invariant to fix this.
    〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉
  =   { S0 }
    〈↑ p: 0 ≤ p ≤ n': S.p.n'〉
  =   { J1' }
    s'
where
 J1: s = 〈↑ p: 0 ≤ p ≤ n: S.p.n〉
We now have to make sure that J1 is also invariant and use the same heuristic as for J0 to do so.
    〈↑ p: 0 ≤ p ≤ n': S.p.n'〉
  =   { S0 and split }
    〈↑ p: 0 ≤ p ≤ n: S.p.(n+1)〉↑ S.(n+1).(n+1)
  =   { (1), see below ; S.n.n = 0 }
    〈↑ p: 0 ≤ p ≤ n: S.p.n + X.n〉↑ 0
  =   { + over ↑ }
    (〈↑ p: 0 ≤ p ≤ n: S.p.n〉+  X.n) ↑ 0
  =   { J1 }
    (s + X.n) ↑ 0
  =   { New statement: S2: s' = (s + X.n) ↑ 0 }
    s'

 S2: s' = (s + X.n) ↑ 0

 (1) S.p.(n+1) = S.p.n + X.n   ⇐   p ≤ n
(1) is a theorem that we calculated because we had a term S.p.(n+1) that we needed to replace by one formulated in terms of S.p.n to make it possible to use J1 to eliminate the quantified maximum. Proof of (1):
    S.p.(n+1)
  =   { (0) }
    〈∑ i: p ≤ i < n+1: X.i〉
  =   { Split range using p ≤ n }
    〈∑ i: p ≤ i < n: X.i〉+  X.n
  =   { (0) }
    S.p.n + X.n
(end of proof)

All that is missing now to have a complete program is to add an initialization for s. An examination of J1 tells us that, for n = 0, S.0.0 is an appropriate value for s. We now get:
   n = 0 ∧ r = 0 ∧ s = 0
 ; while n ≠ N do
    n' = n+1  ∧  r' = r↑s'  ∧  s' = (s + X.n) ↑ 0
   od
This would be a simple enough program to execute except for S1, which has primed variables on both sides of the equality. Let's see if we can find a sequence of assignments (rather than a conjunction of equalities) that will do the same as S0 ∧ S1 ∧ S2. The main law that we use for calculating assignments is
 (2) s := E ; P  ≡  P [s \ E] 
where P [s \ E] is the same as P except that the occurrences of s are replaced by E. The way in which we will make assignments appear is by replacing terms of the initial specification by specifications of the form x' = x, so that we can eventually replace the whole boolean specification by a skip and keep only the assignments.
    n' = n+1  ∧  r' = r↑s'  ∧  s' = (s + X.n) ↑ 0
  =   { Let's deal with s' to introduce s' = s 
         and solve the assignment to r }
      s := (s + X.n) ↑ 0  
    ; (n' = n+1 ∧ r' = r↑s' ∧ s' = s)
  =   { Leibniz }
      s := (s + X.n) ↑ 0
    ; (n' = n+1 ∧ r' = r↑s ∧ s' = s)
  =   { Assignment (either of r or n will do.  Let's pick
         them in inverse order of introduction) }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; (n' = n+1 ∧ r' = r ∧ s' = s)
  =   { One last time, assignment }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; n := n+1  
    ; (n' = n ∧ r' = r ∧ s' = s)
  ⇐   { Introduce skip }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; n := n+1  
    ; skip
  =   { Identity of ; }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; n := n+1
It is a bit disappointing to see that the order between the assignment to r and that to n is irrelevant and that we could have used a multiple simultaneous assignment for them; but then we would have noticed that the choice of grouping n with r or grouping it with s is also irrelevant. One way or another, an irrelevant choice has to be made. By interpreting the initialization in a similar way, we get the following program:
   n, r, s := 0, 0, 0
   { invariant J0 ∧ J1 }
 ; while n ≠ N do
      s := (s + X.n) ↑ 0
    ; r := r ↑ s
    ; n := n + 1
   od
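
To see the result run, here is a direct transcription of the derived program into Python (a sketch of my own; the brute-force restatement of P1 at the end is not part of the derivation, it is only there to test the transcription):

# Transcription of the derived program: X plays the role of the input
# array, N of its length; r is the maximum segment sum and s is the
# maximum sum of a segment ending at position n (invariants J0 and J1).
def max_segment_sum(X):
    N = len(X)
    n, r, s = 0, 0, 0
    while n != N:
        s = max(s + X[n], 0)
        r = max(r, s)
        n = n + 1
    return r

# Brute-force restatement of P1, used only as a test oracle.
def max_segment_sum_spec(X):
    N = len(X)
    return max(sum(X[p:q]) for p in range(N + 1) for q in range(p, N + 1))

if __name__ == "__main__":
    for X in ([], [-1, 2, -1, 3, -4], [-5, -2, -7], [1, 2, 3]):
        assert max_segment_sum(X) == max_segment_sum_spec(X)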

Monday, August 30, 2010

Modularity in Design
versus
Modularity in Code

At the moment, the widespread use of libraries is characterized by the reuse of code in various contexts outside the one which gave it birth. It has been made much easier by the development of .NET --which I salute as being a great technical achievement-- because .NET allows the combination of code written in different languages. I believe it is a stepping stone with respect to exposing programming languages as reasoning aids rather than execution models and it is only hampered by Microsoft's lack of commitment towards portability.

However, I suggest that, although it is an important step, it is not satisfactory as far as the concern for reuse goes. I propose that we start thinking about reusing the thoughts behind the code, and that those are the things of foremost importance. It does not mean that reusing code is not important --assuming we preserve the documentation of its interfaces-- but it does not go far enough. Indeed, in code, all the different ideas justifying the design choices are mangled together and one can hardly look at each aspect in isolation. In the community of programming languages, several attempts have been made to provide better modules but, so far, we still see a lot of code where different concerns end up intertwined with one another, and I expect it will remain the case.

I think that a good modularity cannot be achieved in (what is understood today as) a programming language because of the universal assumption that the code a programmer types has to be directly (through compilation) executable.

My favorite example comes from parallel programming. In the field of programming language design, many mechanisms have been proposed to make the task of parallel programming easier. These include semaphores, monitors and message passing. They make it easier to avoid low level design errors like race conditions but they still don't address design issues. In each case, the code is still organized in processes; since the behavior of a parallel program is really a property of the combination of the processes, a change to said behavior requires a change to many modules when the modular boundaries are drawn around processes.

Aside: I consider threads to be a special kind of process and I don't think that, at the design level, differentiating between the two is of any help at all. I will therefore not mention threads again. (end of aside)

As a comparison, UNITY ([CM88]) uses action systems as the basic unit of composition. An action system has some local variables and can share variables with its environment. It also has a set of atomic actions that can modify the variables. The important thing is that it is possible to make assumptions about how the data is going to be manipulated, to enforce those assumptions and to use them to prove safety and progress properties. When someone wants to change the functionality of a program, he can either add a new module built around the new properties or add the properties to an existing module, if it is close enough to what the module was built for. The key here is to see that program properties (specifically safety and progress properties) map closely onto modules, so we get very good encapsulation.
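
To give a rough impression of the execution model (this is a toy sketch of my own in Python, not UNITY notation, and the random choice merely stands in for a fair scheduler): an action system is a set of guarded atomic actions over shared variables, and an execution repeatedly picks one of them.

import random

# A toy action-system interpreter.  Each action is a (guard, update)
# pair over a shared state; an execution step picks an action and, if
# its guard holds, applies its update atomically.  A fair scheduler
# would pick every action infinitely often; random choice stands in
# for that here.
def run(state, actions, steps=10000):
    for _ in range(steps):
        guard, update = random.choice(actions)
        if guard(state):
            update(state)
    return state

# Two "modules" sharing the variable x: a producer that increments x
# and a consumer that transfers x into y.
producer = [(lambda s: s["x"] < 10,
             lambda s: s.update(x=s["x"] + 1))]
consumer = [(lambda s: s["x"] > 0,
             lambda s: s.update(x=s["x"] - 1, y=s["y"] + 1))]

print(run({"x": 0, "y": 0}, producer + consumer))

The point of the sketch is that a property such as "x never goes negative" can be stated about the actions themselves, independently of how the actions are later mapped onto processes or machines.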

At the end of a development, a UNITY system consists of a set of actions and a set of variables both partitioned between modules. If someone wants to execute the system, he needs first to choose the platform on which he wants to execute it (e.g. it could be a network of computers or a network of processes residing on one computer) and then partition the various variables and actions of the system between the components of the platform without consideration for the modular boundaries. Some variables might become channels between processes or between computers, others may be stored in the memory of a computer, private to one process or shared and possibly protected by locks.

In sequential programming, the control flow is something that can be closely tied to program properties and it can therefore be safely attributed to a modular unit --such as a procedure-- because the properties of interest --preconditions, postconditions and termination-- apply closely to the syntactic structures describing the control flow. That is to say that pre- and postconditions are paired and, between them, lies a sequence of statements structured in a proper fashion for proving correctness. Similarly, loops are associated with invariants and variants, and nothing outside of the loop is relevant for reasoning about said loop.

In contrast, [FvG99] shows the kind of proof obligations one has to contend with when working with processes in parallel programming. In their theory, each assertion has a local correctness, which is analogous to partial correctness in sequential programming, and a global correctness, which relies on the structure of the other processes. At this point, it is unthinkable to take one process away and expect that the properties one has proven about it will hold in a different environment. In short, the interface between processes, when made explicit, is too far from a thin interface to allow processes to properly encapsulate anything.

In UNITY, the fact that it is based on action systems rather than processes makes it possible to attribute a precise and small set of properties to a module and to work on it in isolation. The usual process is called refinement and it makes all the intermediate design decisions explicit. It makes the effort spent on the development easier to transpose to different situations, even when none of the code produced is actually useful.

The UNITY method has proof rules and a formal semantics, and the most rigorous approaches will take advantage of them. However, it is not necessary to use the whole battery to reap the benefits of the method. Since modularity is something UNITY did well, using it and its temporal logic as a guideline for design can already be beneficial in pointing out what we must be careful about.

In short, I believe that it is a property of paramount importance for modular boundaries to confine each programming task to the construction or modification of only one module. The best way to achieve that is to choose a unit of modularity for which the proofs of correctness need only rely on very few simple assumptions about the surrounding environment, chosen a priori --that is, chosen as much as possible independently of any implementation. Some might say that it is an obvious goal but, as far as I can see, very few modular decomposition mechanisms are satisfactory in that respect, and I think that most language designers are still confined by "how the program is going to be executed" and so are unable to meet my criteria.

*                         *
*

I know of another effort directed at the reuse of design: the so-called design patterns. The idea is a good one but its realization is much too weak. They are basically patterns of modular decomposition which have proved useful in many situations. However, there is no basis to state precisely what properties make them useful or how they meet those properties. In short, they cannot have an existence of their own because no notion of correctness applies to them. I do recognize them as a step in the right direction but they seem over-hyped for what they provide. I would welcome the publication of books where designs are proven correct, where it is (formally) precise how the modules fit into a larger design and how one should proceed to code them correctly.

Next, I would have liked to address some of the objections to formal methods that I have heard again and again and that used to unsettle me. However, this post is already pretty long and, by now, it is long overdue. I will, therefore, leave it for a later post which, hopefully, will come soon.

Simon Hudon
April 10th, 2011
Meilen

References

[CM88] K.M. Chandy and J. Misra, Parallel Program Design:
       A Foundation, Addison-Wesley, 1988

[FvG99] W.H.J. Feijen, A.J.M. van Gasteren, On a Method of
       Multiprogramming, Springer-Verlag New York, Inc.,
       1999

Thursday, August 5, 2010

The Development of a Solution to the Problem of the Cubes

Note:
Before I begin, let me point out that the publication of this blog entry is made possible by the fact that I haven't used Blogger's interface to edit it (or at least, I used it as little as possible). Instead, I wrote it in a plain text editor using a markup language I am developing. Since I haven't developed an HTML generator yet, I had to convert it to HTML myself. Blogger did not make it easy, but I think the result is much better than what I got previously by just fighting with the web interface. I would certainly appreciate any feedback concerning the resulting layout.
(end of note)

The first statement of this problem that I saw is in Wim Feijen's note WF114 [0]. I really like the structure of his presentation but, whenever I have reproduced it, either for myself or to present the problem and a method for finding a solution to somebody else, I have made one slight change: instead of using predicate transformer semantics, I use a relational semantics, and I believe the switch to be quite significant from a practical standpoint. I still use the invariant-based loop development techniques developed by Dijkstra & cie instead of the recursion / precondition technique of Hehner, and I think it is very characteristic of my style. It shows that I prefer to manipulate one small part of the program at a time.

DEVELOPMENT
===========

The problem at hand consists in producing an array with the first N cubes without resorting to multiplication operations. Formally, the postcondition we are trying to satisfy is:
 P0:   (A i: 0 ≤ i < N: f.i = i^3)
with f the output variable (a function). Note on notation: '.' is the function application operator and (A i: R: T) is a universal quantification over the range R with the term T. Here, we use the property that ≤ and < are mutually conjunctional, which means that their combined application can be unfolded into
 a ≤ b < c  ≡  a ≤ b ∧ b < c
They are also mutually conjunctional with equality (=), whereas logical equivalence (≡) is not conjunctional at all because it is associative and the two features of the operator would interfere. (end of note)

We can now proceed to find a suitable invariant for our loop. A good technique for doing so is to replace some constants in the postcondition by variables. Since the postcondition is a quantification, it seems reasonable to focus our search on the range. There we find 0 and N. We could replace either or both but, if we use 0 as the initial value of the new variable and N as its final one, we get the first cube for free. We will therefore opt for replacing only N with a variable:
 E0:   k = N
 J0:   (A i: 0 ≤ i < k: f.i = i^3)
If we start k off at 0, the range of J0 becomes false and the invariant is therefore trivially true. From there, it is not a big step to say that increasing k regularly will lead us to E0 (our variant is then N-k).
 I0:   k = 0
 S0:   k' = k + 1
It is now time to see how we will maintain J0 (in the loop) while applying S0 to converge.

Note on proof obligations: if we use the technique of the invariant to build our loop, we have to fulfill certain proof obligations. Assuming that J is the invariant, B is the body of the loop, E is the exit condition, IN is the initialization of the loop and P is the postcondition:
 (0)   J'  ⇐  J ∧ B ∧ ¬ E    "B (and E) preserve(s) J"
 (1)   J  ⇐  IN              "IN establishes J"
 (2)   P  ⇐  J ∧ E           "J and E lead to P"
In the preceding definitions, primed predicates are predicates for which all unprimed program variables are replaced with their primed version. Unprimed variables designate their initial values (before the execution of a given statement or program) and the primed variables designate their final values.
 (3)     (B and E preserve J) 
       ∧ (IN establishes J) 
       ∧ (J and E lead to P)
    ≡ 
       { IN } until E do B od { P }
The last line is to be interpreted as a Hoare triple with assertions between brackets. It reads "started in a state satisfying IN, the program (here the while loop) finishes in a state satisfying P." (end of note)
   (A i: 0 ≤ i < k': f'.i = i^3)
 =   { S0 }
   (A i: 0 ≤ i < k + 1: f'.i = i^3)
 =   { Split off one term }
   (A i: 0 ≤ i < k: f'.i = i^3) ∧ f'.k = k^3
 =   { Modify only f.k; S1: (A i: 0 ≤ i < k: f'.i = f.i) }
   J0 ∧ f'.k = k^3
 S1:   (A i: 0 ≤ i < k: f'.i = f.i)
The last line of the previous proof makes a nice assignment. The only problem is that it requires some multiplications, so we will continue our calculations rather than adopt it as our next assignment.
   f'.k = k^3
 =   { Let's introduce a new variable to hold k^3 }
     { at all times.  J1: a = k^3                 }
   f'.k = a
 S2:   f'.k = a
 J1:   a = k^3
We could also add the symmetric complement of S1, that is, that everything beyond k in the values of f stays unchanged, but it is not required yet. We can add it afterwards to turn S1 and S2 into an array assignment; it is not urgent.
 (4)   (J0'  ≡  J0 ∧ S2)  ⇐  J1 ∧ S0 ∧ S1
Since ⇐ is monotonic in its left argument, we can weaken (4) by weakening its left-hand side, replacing the equivalence by a consequence (⇐). After applying the shunting rule (see (A) in the appendix), we have exactly the shape of the proof obligation for maintaining an invariant... except that the invariant used to maintain J0 (i.e. J0 ∧ J1) is much stronger than J0. We can prove that we can maintain it too, though.
 (5)   J0'  ⇐  J0 ∧ J1 ∧ S0 ∧ S1 ∧ S2
To maintain J1, we will take it apart: we manipulate the side of the equality that mentions k, since we know a lot about k, and hope that the calculation will lead to operations on a.
   k'^3
 =   { S0 }
   (k + 1)^3
 =   { Unfold _ ^3; ∙ over + }
   k^3 + 3∙k^2 + 3∙k + 1
 =   { J1 }
   a + 3∙k^2 + 3∙k + 1
 =   { We need a new variable to hold the term          }
     { which contains a product J2: b = 3∙k^2 + 3∙k + 1 }
   a + b
 =   { We want to fall back on J1'.  S3: a' = a + b is }
     { a good candidate.                               }
   a'
 J2:   b = 3∙k^2 + 3∙k + 1
 S3:   a' = a + b
 (6)   J1'  ⇐  J1 ∧ J2 ∧ S0 ∧ S3
It seems like we have just created more work for ourselves, but we can be reassured by the decreasing degree of the polynomials in our new invariants.
   3∙k'^2 + 3∙k' + 1
 =   { S0 }
   3∙(k + 1)^2 + 3∙(k + 1) + 1
 =   { ∙ over + }
   3∙k^2 + 6∙k + 3 + 3∙k + 3 + 1
 =   { J2 }
   b + 6∙k + 6
 =   { You know the drill! J3: c = 6∙k + 6 }
   b + c
 =   { S4: b' = b + c }
   b'
 J3:   c = 6∙k + 6
 S4:   b' = b + c
 (7)   J2' ⇐  J2 ∧ J3 ∧ S0 ∧ S4
Now for the last touch on the invariant:
   6∙k' + 6
 =   { S0 }
   6∙k + 6 + 6
 =   { J3 }
   c + 6
 =   { S5: c' = c + 6 }
   c'
 S5:   c' = c + 6
 (8)   J3'  ⇐  J3 ∧ S0 ∧ S5
We can now happily conclude
 (9)   S6 preserves J4
with
 S6:   S0 ∧ S1 ∧ S2 ∧ S3 ∧ S4 ∧ S5
 J4:   J0 ∧ J1 ∧ J2 ∧ J3
We started off knowing that J0 ∧ E0 ⇒ P0 so all that is left for a correct loop is to find the initialization. We already have I0 and, from there, we can easily calculate the initial values of the auxiliary variables. Let's ignore the calculations and see the initial values right away.
 I1:   a = 0
 I2:   b = 1
 I3:   c = 6
We can see that there is no value specified for f and, as a matter of fact, any value will do. We now have the final lemma for the correctness of this loop.
 (10)  I0 ∧ I1 ∧ I2 ∧ I3 establishes J0 ∧ J1 ∧ J2 ∧ J3
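
For the record, here is the derived program transcribed into Python (a sketch of my own; the tuple assignment plays the role of the multiple simultaneous assignment, and the final assertion, which does use exponentiation, is only there to test the transcription):

# Transcription of the derived program.  The loop body performs S1/S2
# (the array update) and then S0, S3, S4, S5 as one simultaneous
# (tuple) assignment, so no order among them needs to be invented.
# No multiplication is used inside the loop.
def table_of_cubes(N):
    f = [0] * N
    k, a, b, c = 0, 0, 1, 6                      # I0, I1, I2, I3
    while k != N:                                # exit condition E0: k = N
        f[k] = a                                 # S1, S2: f'.k = a
        k, a, b, c = k + 1, a + b, b + c, c + 6  # S0, S3, S4, S5
    return f

# Test against the postcondition P0 (exponentiation used only here).
assert table_of_cubes(5) == [i ** 3 for i in range(5)]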

CONCLUSION
==========

Looking at the end result, we can see that we ended up with a series of assignments (SX) but we haven't specified an order in which they have to be executed. This is because coming up with an order is both simple and tedious, and the order is so arbitrary that it is all noise. For that reason, we can trust a compiler to do it for us. When we have to deal with a programming language that doesn't support multiple simultaneous assignments, we have to take the time to introduce the noise that is the order ourselves. The key here is that we must not overwrite a variable before we are done with its initial value. We have a bit of a problem when we run into an assignment where the use and the writing of variables are cyclically dependent, like in the following:
 S7:   x' = y ∧ y' = x
To solve the problem, we can add an auxiliary variable (say, t) and initialize it so that it can be substituted for every use of a certain variable.
 I4:   t = x
   x' = y ∧ y' = x
 =   { I4 }
   x' = y ∧ y' = t
 =   { Assignment law (B) }
   (x := y) ; (x' = x ∧ y' = t)
 =   { Assignment law (B) }
   (x := y) ; (y := t) ; (x' = x ∧ y' = y)
 ⇐    { Skip introduction (C) }
   (x := y) ; (y := t) ; skip
 =    { Identity of ; (D) }
   (x := y) ; (y := t)
And, of course, it is also simple to implement I4 and prepend it to the program we just derived. Note: since we are interested in a program that ends up establishing I4, we use the primed variables as our starting point. (end of note)
   t' = x ∧ x' = x ∧ y' = y
 =   { Assignment law (B) }
   (t := x) ; (t' = t ∧ x' = x ∧ y' = y)
 ⇐   { Skip introduction (C) }
   (t := x) ; skip
 =   { Identity of ; (D) }
   t := x
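
As an aside, a language with multiple simultaneous assignment spares us this chore altogether; in Python, for instance (a side note of mine, not part of the derivation):

x, y = 1, 2

# Simultaneous assignment: the language implements  x' = y ∧ y' = x  directly.
x, y = y, x
assert (x, y) == (2, 1)

# Sequentialized with the auxiliary variable, as derived above:
# t := x ; x := y ; y := t
t = x
x = y
y = t
assert (x, y) == (1, 2)
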
In short, this is why I don't bother to do it with the body of my loop.

Finally, the main interest of this programming exercise does not lie in the resulting program. Quite frankly, it accomplishes quite a banal task. However, the technique which led us to it is quite systematic. We used very little knowledge to find it and, unlike Feijen's version, we did not introduce assignments of '?' to mark the necessity of creating a new assignment and then 'guess' what the expression should be. To be honest, it is not a very hard guess, but it is still outside the scope of a calculation. The fact that I integrated that choice into a calculation makes it easier to justify.

Regards,

Simon Hudon
ETH Zürich

APPENDIX
========

Unusual laws of propositional calculus
(A)   A ∧ B ⇒ C  ≡  A ⇒ (B ⇒ C)    { "shunting" }
Hehner's programming laws [1]
(B)   (x := E) ; P  ≡  P [x \ E]     
              { "assignment law" P [x \ E] is the }
              { variable substitution notation,   }
              { P is a predicate, x a program     }
              { variable and E an expression      }
(C)   x' = x  ⇐   skip              
              { For any program variable x }
(D)   P ; skip  ≡   P                
              { For any program predicate P }
REFERENCES
==========

[0] Wim Feijen, WF114 - The problem of the table of cubes,
    http://www.mathmeth.com/wf/files/wf1xx/wf114.pdf
[1] Eric Hehner, A Practical Theory of Programming,
    http://www.cs.toronto.edu/~hehner/aPToP/

Sunday, August 1, 2010

A Prehistoric Qualification of Languages: Succinctness

I read this morning the essay Succinctness = Power [0] by Paul Graham which, so far, I find (note the present tense) to be quite smart. I can't believe he would use a metric like lines of code or number of symbols as a meaningful way to compare implementations. He says that one of the goals of high level programming languages is to be more succinct for expressing a given amount of machine code. Of all people, I would expect somebody who spent so much time with LISP to see the fallacy.

The idea and benefit of a high level language is that it allows you to think in higher level IDEAS. Those ideas can be implemented using half a machine code instruction or a thousand; it is of no relevance for their use. What IS relevant is whether the properties of a particular abstraction are good enough to fit in the context of the solution while including very little (I am tempted to say no) information which does not pertain to the context it will be used in. This, finally, is something LISP is interesting for, because it drives home the message that those abstractions can be encapsulated in functions or data types, but that you can also build up the language to cover them.

It is not to say that I prefer long solutions, but I see succinctness as a consequence --and not the main one at that-- of a language's "power" (which I find to be a poor choice of word, especially since he does not define how he uses it right away). The real power lies in the variety of abstractions that one can define and use. If you exaggerate the aim for succinctness, you end up with APL or Perl, which are notoriously bad. Their goal is not abstraction but packing as many operations as possible into tiny expressions. As a consequence, such solutions are unreadable and they just seem like a dignified way of writing short machine code. It gives no more flexibility and no more readability.

This comment was mostly about the beginning of the essay. I don't have much to say about the rest except that, at some points, he exposes some really nice insights, but it really started off on the wrong foot and I didn't expect the rest of the essay to compensate for that and, in my view, it didn't.

[0] Paul Graham, Succinctness = Power, http://www.paulgraham.com/power.html

Regards,

Simon Hudon
ETH Zürich
August 6th 2010

Wednesday, July 28, 2010

I hate powerpoint slides

I am taking a break in my study session to write about powerpoint slides as they are used by lecturers. Actually, the phenomenon is so general that powerpoint is more the name of one of the most popular products for producing those atrocities than the only one to do so. Since this is a general post for ranting, even if I try to rationalize what is going on, I will focus on how much I hate them.

It seems that most of them are produced as a combination of cue cards for the lecturer and of visual entertainment for the childish students. It means that, although they can be effective at entertaining those with a short attention span (and I mention that I am of those with a short attention span but not of those who are entertained), they are quite ineffective as basic material for any lecture. This is caused by their cue card structure. Those cards are used to give directions to a speaker and they need not contain complete sentences since the material should be in the speaker's head. As a consequence, bad English, ambiguous statements and inaccurate vocabulary plague them, and that would be alright if the cue cards were either discarded after use or kept privately for future repetitions of the lecture by the same person.

I don't know if it is possible to design proper slides to act both as visual aid and as supporting material for a course but, as I am about to start an academic career, I am completely opposed to their use in my lectures. I would rather use a combination of reading assignments and blackboard presentations, with some possible rare exceptions.

That being said, I can now resume my study of the most empty (in academic content) course in my school experience.

Best regards,

Simon Hudon
ETH Zürich
July 28th 2010

Post Scriptum: I am still studying for the same exam and I realize that the slides are really full of motherhood statements [1] like "Repeat the experiments until you get a reasonable standard deviation". I'm not sure how usual this is, but I realize it is something that always annoys me when I see it anywhere.

[1] "Motherhood statements" is an expression I borrowed from Dijkstra; it designates argumentative positions and advice to which it does not make sense to be opposed. To follow the analogy: if you don't qualify your statement, saying that you support motherhood carries no information because the opposite does not make sense. Meyer also calls them argumentative platitudes.

(end of post scriptum)

Wednesday, July 7, 2010

About the Analogy between Verification and Compilation

Some people in the verification community have started responding to the complaint that formal proofs of program correctness are too long by making an analogy between verifiers and compilers:
Formal proofs of correctness are tedious in the same way coding in assembler is tedious. The use of tools, namely high level programming languages, solved the latter, and it can also solve the former.
I've seen it many times and I don't really know whom to credit for it. I think the analogy is fundamentally flawed. For one thing, the translation of a high level programming language into assembler is a computable problem and its use is rather straightforward if the language is well designed and its semantics is as simple and systematic as it can be. We have no such luck with automated theorem provers. Since their job is to solve an uncomputable problem, they have to use lots of heuristics, which make the interface with the user fuzzy. When submitting a formula to the theorem prover, there is no way of telling whether it will be found to be a theorem or not, because the class of recognized theorems is not precisely defined. On the other hand, given a theorem (in our case the conformance of a given program to a given specification), the problem of generating a proof is well known to be uncomputable.

But this should not deter us from seeking a formal backing for our designs because, contrary to the folklore, writing a formal proof is not a tedious activity. It is hard, but it is a very inventive process. It can become tedious, though, if one uses an inappropriate logic like, say, Gentzen's Natural Deduction or the sequent calculus. Although they are popular formal systems, they are hardly the only ones available. That is not to say that they are useless formal systems, just like Turing machines are not useless. They are not meant to be used but meant as a simplification for logicians and theoreticians to understand what tasks can be accomplished with a formal system; they are not proper for understanding how best to accomplish those tasks. Just like programming directly with Turing machines is a waste of effort, proving with Natural Deduction is also a waste of effort. It forces the user to make irrelevant distinctions, for example, with its rule for the proof of logical equivalence: the only way to prove it or even to use it is as a shorthand for two implications. Very often, one can exploit properties of logical equivalence in the calculation of theorems (like it is done in [0] and [1]) in at most half of the effort (and sometimes one fourth or less) that would be expended for the same proof in Natural Deduction.

Contrary to Natural Deduction, the use of calculational proofs allows one to focus on the subject matter and to use the activity of proving as an improvement of one's understanding rather than as a necessary evil for being certain of one's guesses. As a consequence, automating the construction of the significant proofs of correctness does not improve productivity but decreases it, because it removes any useful hints that the (human) prover could use to orient his efforts in the right direction.

In short, in all important respects, the analogy is invalid and cannot be used to defend the necessity of automating theorem proving. It is so, first, because of the two compared activities only one solves a computable problem and, second, because one of them is tedious and repetitive whereas the other involves a lot of creativity. The fallacy comes from taking some properties of inappropriate formal systems as being valid for all of them. To be effective at proving, we don't only need a logic which is sound and theoretically powerful enough, we need a formalism suited as a carrier for reasoning.

References

[0] A.J.M. van Gasteren, On the Shape of Mathematical Arguments
[1] E.W. Dijkstra and C.S. Scholten, Predicate Calculus and Program Semantics

Monday, July 5, 2010

Quality Criteria for Writing

I have been less active in writing lately, and many explanations are possible. I have possibly ranted to my heart's content about my situation in Zürich and the sins of the industrial and academic communities, but that is probably an exaggeration. As I discuss with people and try to be more nuanced in my opinions, I find new and interesting faults to describe, some of which I have started entries about and never finished.
I stumbled upon EWD 1068 today, which I had never read, and which makes me presume that I might have mispostulated (please accept the offering of that verb) my audience as not being curious or sympathetic. In any case, I should know better than to think that people interested in my fabulations are not curious; this blog is very rarely a step-by-step tutorial for doing anything in particular.
Mostly, I find that I feel shackled by Blogger. Whenever I find a very beautiful proof that I would like to share on my blog, half an hour to an hour of struggling with the layout dissuades me from sharing that kind of logical poetry. That is really a shame and I am as yet unsure about what I am going to do about it.
In any case, enjoy the reading of EWD 1068, I found it quite interesting.
Simon Hudon
ETH Zürich
July 5th, 2010

Wednesday, June 2, 2010

Some Light Technical Reading

I realize that I haven't posted anything in a while and, as an apology, I will explain to my readers (all four of them!) that I have several posts in a draft state which I cannot bring to a satisfactory one. To show that I am still alive and interested in writing, I offer you a reference to a proof I read today. I wanted to do the same thing myself, but the solution is really delightful and I didn't think I could top it. So, without further ado, here is an essay from Dijkstra's EWD series:
(For the interested, all or most of his essays can be obtained here.)
Simon Hudon
Wednesday, June 2, 2010
ETH Zürich