Simon's blog prime: programming

Showing posts with label programming. Show all posts

Monday, July 7, 2014

Where have I been?

It has been a while since my last blog post. I have a collection of posts that I have been meaning to posts but that were somehow never good enough. I'm going to try harder to publish them. In the mean time, here is a snapshot of what has been going on with me in the last two years... at least scientifically.

I underwent a peculiar transformation since I moved from Zürich to Toronto. I used to think that you can study a software system purely from a logical point of view and, when you proved all you needed to prove, all is left to do is run the software. I have done it for some non-trivial software system but I realize that, for a bigger piece of software, the amount of work to cover the whole thing would be excessive both in the sense of "exceeding necessity" and "exceeding ability to do so". I'm reworking my vision of software design as an activity that includes proving theorem but which also comprises other techniques of validation. For example, random testing as done by QuickCheck in Haskell is often a good compromise. The key is to choose which technique applies best.

For instance, I started developing a program one year ago and it has now become relatively big (considering that I'm the only person developing it) and, to my surprise, it works well even though I didn't prove a single property about it. In the back of my mind however, I keep pulling some parts of the design aside and figuring out the logic that would make it possible to prove that it is correct. One of these days, I'll prove the correctness of some crucial parts of my programs and I think that will be both really interesting and really useful -- especially considering that the program is a formal verifier.

On the other hand, I am also discovering the interplay between mastering the theory and having good software tools for designing software. I used to think that if you know the theory well, you have no need for tools. I'm realizing that this is a pretty silly point of view: the program that I mentioned above is specialized in writing proofs for simple conjectures. It very incredibly fast. In one of the small systems that I use that tool for, there are around 500 conjectures that need to be proved. Most of them are incredibly easy to prove but they still need to be proved. My tool takes about 30s to prove 499 of them leaving me with the truly difficult theorem. That's incredibly helpful because now, I can focus my attention on the hardest of the hard design problems. The best part is, compared to an interactive prover, I do not need to manually invoke the prover on each proof obligation. They will all be attempted automatically except for those with a proof provided by the user.

The tool is designed to allow me to verify Unit-B models. I don't think I would be able to use Unit-B as confidently as I do and as efficiently as I do without it. Even better, when using Unit-B to develop the very first examples, every time I was faced with a problem, I could improvise a solution and think really hard and figure out that it was valid. This is very inconvenient to teach the method because I hardly have a list of all the tricks that I use. By implementing a verifier, I need to identify the truly useful techniques and make sure that they can be used with confidence and have a clear explanation to disallow whatever I choose to exclude. In other words, I'm now forced to formulate the rules of the method independently of any one example.

So much for simplistic views of the world.

Tuesday, July 5, 2011

On Function Purity and Simple Framing

Here is the latest blog post by Bertrand Meyer.

http://bertrandmeyer.com/2011/07/04/if-im-not-pure-at-least-my-functions-are/

It talks about the problem of function purity and broaches the subject of framing. In short, the frame of the specification of a routine is the set of variables (in the broad meaning of the term) that the execution of the routine can modify. In object oriented programming, it is especially tricky because of the sets of dynamically allocated objects of which it might be necessary to alter the state in a given routine. Indeed such set can be taken to be the set of objects reachable from a given root. In a limited solution, such as shown by Meyer in his latest post, you only name variables and expressions that can change. In more elaborate solutions, such as the dynamic frames of Kassios (as quoted by Meyer), it is possible to provide a set of objects and the attributes that can be expected to change. It is especially handy for specifying dynamic data structures like linked lists.

Such a solution seems to present the disadvantage of disallowing the extension of frames through inheritance, which Meyer preserves. I think a compromise can be reached between the two but it is a bit too involved for a post only intended for sharing another blog post.

Monday, October 18, 2010

The Segment of Maximum Sum

This is the recording of a solution found together with Bernhard Brodowsky where he did most of the work in one of the first lectures where I explained the techniques to him. The problem is to find a segment of an array for which the sum of the elements is maximal. We stopped when we had a simple specification of the body of the loop for fear that actually calculating the assignments would be too much tedium. In this recording, I will calculate them to see exactly how tedious it is. Also, we did not pay any mind to calculating too many auxiliary values --that is using them before updating the variable holding it in the loop body so that one is always thrown away--. In this recording, the solution will be changed slightly to avoid that.

First, here is the postcondition that we want to implement:

 P0: r = 〈↑ p, q: 0 ≤ p ≤ q ≤ N: 〈∑ i: p ≤ i < q: X.i〉〉

where ↑ denotes the maximum of two operands. Like the conjunction and the disjunction, it is idempotent, associative and symmetric so we can use it in the same was as the universal and existential quantification. The first thing to do is to introduce a name for the summation part so that we can manipulate it independently of the overall specification.

 (0) S.p.q = 〈∑ i: p ≤ i < q: X.i〉

And the specification becomes:

 P1: r = 〈↑ p, q: 0 ≤ p ≤ q ≤ N: S.p.q〉

The traditional way to get an invariant from this is to replace N by a variable, let's call it n, and use n = N as an exit condition:

 J0: r = 〈↑ p, q: 0 ≤ p ≤ q ≤ n: S.p.q〉 
 C0: n = N

By symmetry, we can assume that 0 is a proper initial value for n which makes S.0.0 a good one for r.

 I0: n = 0
 I1: r = 0

Last formality: having n increase by one at each iteration will guarantee termination provided 0 ≤ N.

 S0: n' = n + 1

Let's now start calculating with J0 to see what adjustments we have to make to r to preserve it. Two distinct heuristics get us to start calculating with the right-hand side of the equality: it is much more complicated than the left-hand side so we can assume that most of our moves will be eliminations and all the program variable involved in the right-hand side don't have a matching assignment so we would basically be stuck right at the start. For a given loop with invariant J and body S, the invariance of J is assured by:

 J'  ⇐  J ∧ S

where J' is J with all its program variables primed. It means we can use whatever is part of the body or the invariant to prove J's invariance, to do so, we can invent new statements for the body as we go.

    〈↑ p, q: 0 ≤ p ≤ q ≤ n': S.p.q〉 
  =   { S0 }
    〈↑ p, q: 0 ≤ p ≤ q ≤ n+1: S.p.q〉 
  =   { Split range }
      〈↑ p, q: 0 ≤ p ≤ q ≤ n: S.p.q〉
    ↑ 〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉 
  =   { Use final value of a new variable using
         Q0: s' = 〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉 and J0 }
    r ↑ s'
  =   { New statement: S1: r' = r ↑ s' to conclude
         the proof of J0's invariance }
    r'

 Q0: s' = 〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉
 S1: r' = r ↑ s'

Q0 links unprimed variables with primed variables in a way that cannot be evaluated. Let's turn it into an invariant to fix this.

    〈↑ p: 0 ≤ p ≤ n+1: S.p.(n+1)〉
  =   { S0 }
    〈↑ p: 0 ≤ p ≤ n': S.p.n'〉
  =   { J1' }
    s'

where

 J1: s = 〈↑ p: 0 ≤ p ≤ n: S.p.n〉

We now have to make sure that J1 is also invariant and use the same heuristic as for J0 to do so.

    〈↑ p: 0 ≤ p ≤ n': S.p.n'〉
  =   { S0 and split }
    〈↑ p: 0 ≤ p ≤ n: S.p.(n+1)〉↑ S.(n+1).(n+1)
  =   { (1), see below ; S.n.n = 0 }
    〈↑ p: 0 ≤ p ≤ n: S.p.n + X.n〉↑ 0
  =   { + over ↑ }
    (〈↑ p: 0 ≤ p ≤ n: S.p.n〉+  X.n) ↑ 0
  =   { J1 }
    (s + X.n) ↑ 0
  =   { New statement: S2: s' = (s + X.n) ↑ 0 }
    s'

 S2: s' = (s + X.n) ↑ 0

 (1) S.p.(n+1) = S.p.n + X.n   ⇐   p ≤ n

(1) is a theorem that we calculated because we had a term S.p.(n+1) that we needed to replace by one formulated in terms of S.p.n to make it possible to use J1 to eliminate the quantified maximum. Proof of (1):

    S.p.(n+1)
  =   { (0) }
    〈∑ i: p ≤ i < n+1: X.i〉
  =   { Split range using p ≤ n }
    〈∑ i: p ≤ i < n: X.i〉+  X.n
  =   { (0) }
    S.p.n + X.n

end of proof All that is missing now to have a complete program is to add an initialization for s. An examination of J1 tells us that for n = 0, S.0.0 is an appropriate value for s. We now get:

   n = 0 ∧ r = 0 ∧ s = 0
 ; while n ≠ N do
    n' = n+1  ∧  r' = r↑s'  ∧  s' = (s + X.n) ↑ 0
   od

This would be a simple enough program to execute except for S1 which has primed variables on both side of the equality. Let's see if we can find a sequence of assignments (rather than a conjunction of equalities) that will do the same as S0 ∧ S1 ∧ S2. The main law that we use for calculating assignments is

 (2) s := E ; P  ≡  P [s \ E]

where P [s \ E] is the same as P except for the occurrences of s which are replaced by E. The way in which we will make assignments appear is by replacing terms of the initial specification by specifications of the form x' = x so that we can eventually replace the whole boolean specification by a skip and keep only the assignments.

    n' = n+1  ∧  r' = r↑s'  ∧  s' = (s + X.n) ↑ 0
  =   { Let's deal with s' to introduce s' = s 
         and solve the assignment to r }
      s := (s + X.n) ↑ 0  
    ; (n' = n+1 ∧ r' = r↑s' ∧ s' = s)
  =   { Leibniz }
      s := (s + X.n) ↑ 0
    ; (n' = n+1 ∧ r' = r↑s ∧ s' = s)
  =   { Assignment (either of r or n will do.  Let's pick
         them in inverse order of introduction }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; (n' = n+1 ∧ r' = r ∧ s' = s)
  =   { One last time, assignment }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; n := n+1  
    ; (n' = n ∧ r' = r ∧ s' = s)
  ⇐   { Introduce skip }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; n := n+1  
    ; skip
  =   { Identity of ; }
      s := (s + X.n) ↑ 0  
    ; r := r↑s  
    ; n := n+1

It is a bit disappointing to see that the order between the assignment to r and that to n is irrelevant and we could have used a multiple simultaneous assignment for them but, then, we would have noticed that the choice of grouping n with r or grouping it with s is also irrelevant. One way or another, an irrelevant choice has to be made. By interpreting the initialization in a similar way, we get the following program:

   n, r, s := 0, 0, 0
   { invariant J0 ∧ J1 }
 ; while n ≠ N do
      s := (s + X.n) ↑ 0
    ; r := r ↑ s
    ; n := n + 1
   od

Thursday, August 5, 2010

The Development of a Solution to the Problem of the Cubes

Note:
Before I begin, let me point out that the publication of this blog entry is made possible because I haven't used Blogger's interface to edit it (or at least, I used it as little as possible). Instead, I wrote it in a plain text editor using a markup language I am developping. Since I don't have developed an HTML generator yet, I had to convert it myself to HTML. Blogger did not make it easy but I think the result is much better than what I got previously by just fighting with the web interface. I would certainly appreciate any feedback concerning the resulting layout.
(end of note)

The first statement of this problem that I have seen is in Wim Feijen's note WF114 [0]. I really like the structure of his presentation but, whenever I have reproduced it, either for myself of to present the problem and a method to find a solution to somebody else, I have made on slight change: instead of using predicate transformers semantics, I use a relational semantics and I believe the switch to be quite significant from a practical standpoint. I still use the invariant based loop development techniques developped by Dijkstra & cie instead of the recursion / precondition technique of Hehner and I think it is very characteristic for my style. It shows that I prefer to manipulate one small part of the program at a time.

DEVELOPMENT ===========

The problem at hand consists in producing an array with the N first cubes without resorting to multiplication operations. Formally, the postcondition we are trying to satisfy is:

 P0:   (A i: 0 ≤ i < N: f.i = i^3)

with f the output variable (a function). About notation '.' is the function application operator and (A i: R: T) is a universal quantification over the range R and with the term T. Here, we use the property that ≤ and < are mutually conjunctional which means that their combined application can be unfolded into

 a ≤ b < c  ≡  a ≤ b ∧ b < c

They are also mutually conjunctional with equality (=) whereas logical equivalence (≡) is not conjunctinal at all because it is associative and the two features of the operator would interfere. (end of note) We can now proceed to find a suitable invariant for our loop. A good technique to do so is to replace some constants in the postcondition by variables. Since it is a quantification, it seems reasonable to focus our search in the range. We find 0 and N. We could choose either or both but if we use 0 as our initial value and N as the final one, we get the first cube for free. We will therefore opt for replacing only N with a variable:

 E0:   k = N
 J0:   (A i: 0 ≤ i < k: f.i = i^3)

If we start k off at 0, the range of J0 is becomes false and the invariant is therefore trivially true. From there, it is not a big step to say that increasing k regularly will lead us to E0 (our variant is then N-k).

 I0:   k = 0
 S0:   k' = k + 1

It is now time to see how we will maintain J0 (in the loop) while applying S0 to converge. Note on proof obligations If we use the technique of the invariant to build our loop, we have to fulfill a certain certain proof obligation. Assuming that J is the invariant, B is the body of the loop, E is the exit condition, IN is the initialization of the loop and P is the postcondition:

 (0)   J'  ⇐  J ∧ B ∧ ¬ E    "B (and E) preserve(¬s) J"
 (1)   J  ⇐  IN              "IN establishes J"
 (2)   P  ⇐  J ∧ E           "J and E lead to P"

In the preceding definitions, primed predicates are predicates for which all unprimed program variables are replaced with their primed version. Unprimed variables designate their initial values (before the execution of a given statement or program) and the primed variables designate their final values.

 (3)     (B and E preserve J) 
       ∧ (IN establishes J) 
       ∧ (J and E lead to P)
    ≡ 
       { IN } until E do B od { P }

The last line is to be interpreted as a Hoare triple with assertions between brackets. It reads "started in a state satisficing IN, the program [here the while loop], finishes in a state satisficing P." (end of note)

   (A i: 0 ≤ i < k': f'.i = i^3)
 =   { S0 }
   (A i: 0 ≤ i < k + 1: f'.i = i^3)
 =   { Split off one term }
   (A i: 0 ≤ i < k: f'.i = i^3) ∧ f'.k = k^3
 =   { Modify only f.k; S1: (A i: 0 ≤ i < k: f'.i = f.i) }
   J0 ∧ f'.k = k^3

 S1:   (A i: 0 ≤ i < k: f'.i = f.i)

The last line of the provious proof makes a nice assignment. The only problem is that it requires some multiplications, we will continue out calculations rather than adopt it as our next assignment.

   f'.k = k^3
 =   { Let's introduce a new variable to hold k^3 }
     { at all time.  J1: a = k^3                  }
   f'.k = a

 S2:   f'.k = a
 J1:   a = k^3

If we added the symmetric complement of S1, that is that everything beyond k in the values of f stays unchanged but it is not required yet. We add it afterwards to complement S1 and S2 as array assignments but it is not urgent.

 (4)   (J0'  ≡  J0 ∧ S2)  ⇐  J1 ∧ S0 ∧ S1

Since the left argument of ⇐ is monotonic, we can weaken (4) by weakening its left-hand side and replacing the equivalence by a consequence (⇐). After applying the shunting rule (see (A) in the appendix), we have exactly the shape of the proof obligation for maintaining an invariant... except that invariant used to maintain J0 (i.e. J0 ∧ J1) is much stronger than J0. We can prove that we can maintain it too, though.

 (5)   J0'  ⇐  J0 ∧ J1 ∧ S0 ∧ S1 ∧ S2

To maintain J1, we will split it in two, manipulate the left-hand side of the equality since we know a lot about k and hope it will lead to operations on a.

   k'^3
 =   { S0 }
   (k + 1)^3
 =   { Unfold _ ^3; ∙ over + }
   k^3 + 3∙k^2 + 3∙k + 1
 =   { J1 }
   a + 3∙k^2 + 3∙k + 1
 =   { We need a new variable to hold the term          }
     { which contains a product J2: b = 3∙k^2 + 3∙k + 1 }
   a + b
 =   { We want to fall back on J1'.  S3: a' = a + b is  }
     { good candidate.                                  }
   a'

 J2:   b = 3∙k^2 + 3∙k + 1
 S3:   a' = a + b

 (6)   J1'  ⇐  J1 ∧ J2 ∧ S0 ∧ S3

It seems like we have just created some more work for ourselves but we could be reasured by the sight of the decreasing order of the equations in our new invariants.

   3∙k'^2 + 3∙k' + 1
 =   { S0 }
   3∙(k + 1)^2 + 3∙(k + 1) + 1
 =   { ∙ over + }
   3∙k^2 + 6∙k + 3 + 3∙k + 3 + 1
 =   { J2 }
   b + 6∙k + 6
 =   { You know the drill! J3: c = 6∙k + 6 }
   b + c
 =   { S4: b' = b + c }
   b'

 J3:   c = 6∙k + 6
 S4:   b' = b + c

 (7)   J2' ⇐  J2 ∧ J3 ∧ S0 ∧ S4

Now for the last touch on the invariant:

   6∙k' + 6
 =   { S0 }
   6∙k + 6 + 6
 =   { J3 }
   c + 6
 =   { S5: c' = c + 6 }
   c'

 S5:   c' = c + 6

 (8)   J3'  ⇐  J3 ∧ S0 ∧ S5

We can now happily conclude

 (9)   S6 preserves J4

with

 S6:   S0 ∧ S1 ∧ S2 ∧ S3 ∧ S4 ∧ S5
 J4:   J0 ∧ J1 ∧ J2 ∧ J3

We started off knowing that J0 ∧ E0 ⇒ P0 so all that is left for a correct loop is to find the initialization. We already have I0 and, from there, we can easily calculate the initial values of the auxiliary variables. Let's ignore the calculations and see the initial values right away.

 I1:   a = 0
 I2:   b = 1
 I3:   c = 6

We can see that there is no value specified for f and, as a matter of fact, any value will do. We now have the final lemma for the correctness of this loop.

 (10)  I0 ∧ I1 ∧ I2 ∧ I3 establishes J0 ∧ J1 ∧ J2 ∧ J3

CONCLUSION ========== Looking at the end result, we can see that we ended up with a series of assignments (SX) but we haven't specified an order in which they have to be executed. It is because it is both simple and tedious to come up with an order and the order so arbitrary that it is all noise. For that reason, we can trust a compiler to do it for us. When we have to deal with a programming languages that doesn't support multiple simultanous assignments, we have to take the time to introduce the noise that is the order ourselves. The key here is that we must not overwrite a variable before we are done with its initial value. We have a bit of a problem when we run into an assignment where the use and the writing of variables are cyclicly dependant like in the following:

 S6:   x' = y ∧ y' = x

To solve the problem, we can add an auxiliary variable (say, t) and initialize it so that it can be substituted for every use of a certain variable.

 I4:   t = x

   x' = y ∧ y' = x
 =   { I4 }
   x' = y ∧ y' = t
 =   { Assignment law (B) }
   (x := y) ; (x' = x ∧ y' = t)
 =   { Assignment law (B) }
   (x := y) ; (y := t) ; (x' = x ∧ y' = y)
 ⇐    { Skip introduction (C) }
   (x := y) ; (y := t) ; skip
 =    { Identity of ; (D) }
   (x := y) ; (y := t)

And, of course, it is also simple to implement I4 and prepend it to the program we just derived. Note: Since we are interested in a program that ends up with I4, we are going to use primed variables. (end of note)

   t' = x ∧ x' = x ∧ y' = y
 =   { Assignment law (B) }
   (t := x) ; (t' = t ∧ x' = x ∧ y' = y)
 ⇐   { Skip introduction (C) }
   (t := x) ; skip
 =   { Identity of ; (D) }
   t := x

In short, this is why I don't bother to do it with the body of my loop. Finally, the main interest of this programming exercise does not lie in the resulting program. Quite frankly, it accomplishes quite a banal task. However, the technique which led us to it is quite systematic. We used very little knowledge to find it and, unlike Feijen's version, we did not introduce assignments of '?' to mark the necessity to create a new assignments to then 'guess' what should be the expression. To be honest, it is not a very hard guess but it is still outside of the scope of a calculation. The fact that I integrated that choice in a calculation makes it easier to justify. Regards, Simon Hudon ETH Zürich APPENDIX ======== Unusual laws of propositional calculus

(A)   A ∧ B ⇒ C  ≡  A ⇒ (B ⇒ C)    { "shunting" }

Hehner's programming laws [1]

(B)   (x := E) ; P  ≡  P [x \ E]     
              { "assignment law" P [x \ E] is the }
              { variable substitution notation,   }
              { P is a predicate, x a program     }
              { variable and E an expression      }
(C)   x' = x  ⇐   skip              
              { For any program variable x }
(D)   P ; skip  ≡   P                
              { For any program predicate P }

REFERENCES ========== [0] Wim Feijen, WF114 - The problem of the table of cubes, http://www.mathmeth.com/wf/files/wf1xx/wf114.pdf [1] Eric Hehner, A Practical Theory of Programming, http://www.cs.toronto.edu/~hehner/aPToP/

Saturday, November 28, 2009

Back in the 60s

Since the beginning of Computing Science and computer programming, there has been an important evolution in the methods used to design software. We're now much better than we used to be to produce software. When I turn around and look at the Internet trend though, I have the impression of looking at a software seen of at least 40 years ago. Whereas people are now capable of choosing a programming language on account of the abstractions it allows them to formulate and use for desktop applications, it seems that web programming is still tightly tied with the individual idiosyncrasies of the browser that will be used to run them. Whenever I complain about the hazards of using Javascript, I get the answer that this is what gets executed in the browsers! Since I'm very enthusiastic about the research that I do, I don't usually refrain to mention the sorry state of affair of software and the thought that formal methods can be of great use for that. One year ago, I met a friend of a friend who is enthusiastic about web programming. When I suggested that we should be very careful about the properties that we select for our programs and the assumptions that their validity rely on, he pointed out that would be of very little help for web programming because of the important differences that exist between the browsers. Continuing in that direction, I got the strong feeling that he considered the variety of platforms to be one of the most important and most interesting challenges of web programming. I could not help but be remembered the accounts of the software scene of the 60s that I have read. The accidental complexity is so important that people mistake it for the core of the problem. In that respect, I guess this is no surprise that web applications are of such a poor quality. What triggered the present note is my suffering from the poor quality and and random behavior of Facebook, Google Wave and, while I type, I am reminded that Blogger is not much better. A final word, lest I am mistakenly assumed to be satisfied with the state of the art in non-browser based application: they appear much better in comparison to web application but there are important shortcomings also and I think that they are part of the problem of the web scene. Simon Hudon November 28th Meilen

Wednesday, November 4, 2009

Bad Separation of Concerns

In the course of my master's, I have to attend a seminar and prepare one talk. Last week, the talk I listened to was about using pattern matching in object oriented programming languages. I thought, as the talk went on, that all the yardstick they took to evaluate the design of their language construct was all wrong. It struggled, in my opinion in vain, to improve the expressiveness while retaining the best possible performances and remaining "concrete". In that respect, I tend to take for granted the necessity of expressing precise specifications with every piece of code. This means that, for me, the most important is that my specification language be expressive enough, but still very simple, to allow me to reason about my problems and my implementation language be as simple as possible and allow me to express any algorithm I wish. In my experience, the specification language should involve as little operational elements as possible for simplicity. As for the implementation language, it should be possible to implement it on any reasonable architecture. Before I'm objected that it is not customary in industry to use precise specification notations or to keep multiple representations, at different degrees of abstraction, of a single solution, I have to point out that this is an academic result and it does not have to follow industrial customs. Actually, I would go one step further and state that, if, in academia, we wait for techniques to become widely adopted before using them, we are lagging behind and risk doing no more than producing more of the same. Academia being more remote from the market place than industry are supposed to be less affected by the political issues that keep industry behind the state of the art. And this is not badmouthing the industry: it is understandable that they have more concerns than only technological ones. Let's get back to the paper. In their design, they seemed to tackle both problems of expressiveness and implementation at once and the result was not pretty. For example, in their pattern matching, they had to take into consideration that the evaluation of the expression (which, in their case, is an absolute necessity) they were dealing with could very well have side effects and this was an important concern of theirs. I think that this possibility was popularized by C and I think this is an important set back with respect to allowing programmers to think about their programs. I think, already, the discipline of separating queries from commands, which is part of Eiffel's style, is much better in this respect. They also had to deal with constructors in order to match objects like you would in Haskell. This is very ugly. Pattern matching is a very elegant way of writing expressive specifications but object creations do not fit in it. This is because an expression creating an object has an implicit reference to memory allocation and it can't be dissociated to it. People can't forget that the object has a memory representation and that a comparison in the context of pattern matching will have a given cost. I prefer to use mathematical expressions to express the value contained by the object and allocate the object only if necessary.

Aside In a recent programming experiment where I derived my code from its proof of correctness, I was parsing a string and, since all the properties of a language can so elegantly be formulated in terms of strings of character, I used strings to represent the state of the computation. I had my input string S, the processed string x and the left over string z and they were linked by the invariant:

S = x ++ z

with ++ the string concatenation operator. At each iteration, I split z in two parts such that

z = [ y ] ++ w

and scanning one character was encoded as:

x, z := x ++ [ y ], w

If you can't help yourself and have to imagine the memory footprint and performances of this algorithm, you might be scandalized that this is really inefficient. This is not meant to be executed as such; it is only a very simple way of understanding defining what the algorithm does. With one data refinement, I went to a representation where the use of memory is limited to the array representing the input, an integer index indicating which character y is and two booleans to represent the acceptance state of the algorithm. There is no efficiency problem with that, and my proof of correctness and my proof of data refinement are very simple and I consider them the clearest documentation for the algorithm.

end of aside

In summary, I think the right way of seeing that problem is:

Pattern Matching is a power way of associating properties or code with the shape of the data and we need to be able to express it. Since we can use it with code, it would be important to have programming techniques to make the executable, that is, techniques usable by a programmer to transform a specification using pattern matching into executable code implementing it. If the methods are especially effective and simple, we could have it implemented in our compiler but this is far from being a design consideration for the construct.

* * *

More recently, in my compiler design class, we were presented a research programming language which is said to be "relationship-based". I think the idea is a good one: they factor the inter-class relationship out of classes to treat them as separate modules. The question whether each such relationship deserve its own module is worthy of debate but, for now, what bothers me is that they treat it as a programming language construct, that is to say they keep an "implementability anchor" in their heads while they design the notation. One of the consequence is that they will refrain from adding anything for which they aren't sure they can compile it automatically. I would rather see this emerge as a specification structuring artifact and see some programming techniques emerge in connection to it so that the most common one can be implemented quite straightforwardly. For the rest, it's open to experimentation and, in all cases, a case by case implementation would allow optimization that systematic compilation wouldn't be able to do.

I think the big problem here is still the taking of current practices as a guideline for academic research: "since programmers don't write specifications, we won't provide a means to write good specifications even though progress would require it". I put it in my "more of the same" bin, look at the original ideas and forget about the language. Actually, I will do so when my semester is over.

Simon Hudon

ETH Zürich

finished on November 18th

Sunday, October 11, 2009

Modeling vs Programming vs Coding

I usually adhere to Dijkstra's and Morgan's view of programming: it's the action of passing from a precise specification to an executable program. I've had some contacts recently with the Chair of Programming Methodology recently and I was expecting them to have a similar view to mine on what constitutes programming. I was a little disappointed to see that their view was more closely related to "writing down executable code" than "reflecting on the design of a suitable implementation for a given specification". What they call programming, I would rather call coding. Before that, I had a short talk with Dr. Jean-Raymond Abrial, founder of the B and Event-B formal methods where I saw that he shunned programming. However, I came to realize that what he called programming was also coding. On the other hand, he called the activity of central importance to software engineers "modeling" which is a term I have been trying to avoid for some time, now, because of its central role in methods like UML, which I find eminently ignorable for software engineers; it simply seems to stem from confusing vagueness with abstraction and I think there's no reasons to accept vague documentation for a design. I still prefer the word programming for the formal derivation of implementations but I feel more and more like I will have to find a substitute because of the lexicon associated with programming. It is associated with programs which is related to routines and sequential execution whereas I believe that sequential vs parallel execution is an implementation choice that should not receive half of the importance that it is usually given in the development process. To be absolutely unambiguous, I could rename the activity "designing a software system" taking the liberty of changing software with discrete if necessary but it might be a bit too long. However, design is an activity so widespread that it does not feel appropriate to use it on its own when talking about programming. For this purpose, I could resort to "modeling" but I run the risk of having to precise all the time that I have nothing to do with UML or their bubbles and arrows techniques. Ah, well! I'll keep thinking about it. In the mean time, I'll keep programming for short and design of software system to be unambiguous. Simon Hudon October 11th, 2009 Meilen

Friday, September 18, 2009

Hehner's Program Calculus

Preamble This post, although involving unusual notions for most, is meant to be very basic and I expect most computer scientists, computer science students to be able to understand it. If you don't please let me know, I'll arrange to put enough background in it.

End of preamble

For some time, Prof. Jonathan Ostroff has been trying to convince me of the elegance of Hehner's method of refinement. Recently, I was giving examples of Dijkstra's program calculus. Among these was the derivation of the program for computing the maximum of two numbers and its proof. I eventually came around to consider this example terms of Hehner's method and I was struck with at how simple his formulation is. For example, the semantics of the following conditional statement is given in Hehner's formulation and in Dijkstra's (and Morgan's, for that matter).

if b -> x := E [] c -> x := F fi

Hehner (x' denotes the value of the variable x after execution whereas x denotes its initial value):

(b  /\  x' = E)  \/  (c /\ x' = F)

Dijkstra (it is defined as a function of the predicate p and it represents the weakest precondition that will allow the statement to enforce the postcondition p):

(b   ==>   (x := E).p)  /\  (c   ==>   (x := F).p)  /\  (b  \/  c)

Just here, we have a factor of a little bit less than two. It is a bit of an unfair comparison, though since the two formulae are not used in the same way. It is interesting to note, however, that Dijkstra's formulation includes a term (b \/ c) which he calls "the law of excluded miracle" (as a pun referring to "the law of excluded middle"). Its purpose is to make sure that one of the branches of the conditional applies whenever we get ready to execute the statement. On the other hand, Hehner's formulation implies (b \/ c) and it does not have to be included in the term. Unfortunately, it is necessary with Hehner's method to mention every variable that do not change which, as it appears to me, an amateur when it comes to Hehner's method, can be quite inconvenient.

Here is the development of the mentioned program with both methods. We start by stating the postulates that we have.

(0)   x max y  >=  x
(1)   x max y  >=  x
(2)   x max y  =  x   \/   x max y  =  y
(3)   x max y  =  y max x
(4)   x  =  x max y   ==   x  >=  y
(5)   y  =  x max y   ==   y  >=  x
(6)   x >= y  \/  y >= x

Now, here is the postcondition that we want to implement:

P:  z =  x max y

because of the form of (2) we could presume that the appropriate execution of either of:

Q:  z := x

and

R:  z := y

We must therefore investigate when it is appropriate to execute each. For that, we calculate the weakest precondition that allows each of them to establish the postcondition. We can do that by applying the statements Q and R as a subtitution over the postcondition P.

   Q.P
==    { Subtitution }
   x  =  x max y
==    { (4) }
   x >= y

(with == being the logical equivalence) Using a similar calculation, we get

R.P    ==    y >= x

We can conclude that the following code establishes the postcondition:

if  x >= y -> z := x 
[]  y >= x -> z := y 
fi

The only thing left to do is to apply the law of excluded miracle and make sure that there is always a branch which is executable.

   (x >= y)  \/  (y >= x)
==    { (6) }
   true

If we try the same derivation with Hehner's method, we can start by saying that the postcondition is trivially established by

P:  z := x max y

which, unfortunately, is not executable. For simplicity, we will assume that x is the only assignable variable. We have the following statement representing the semantics of P.

Q:  z'  =  x max y

We can now start calculating:

   z'  =  x max y
==    { Identity of /\ }
   z' = x max y  /\  true
==    { (2) }
   z' = x max y  /\  (x = x max y \/ y = x max y)
==    { /\ distributes over \/ }
   (z' = x max y  /\  x = x max y)  \/  (z' = x max y  /\  y = x max y)
==    { Leibniz, twice }
   (z' = x  /\  x = x max y)  \/  (z' = y  /\  y = x max y)
==    { (4) and (5) }
   (z' = x  /\  x >= y)  \/  (z' = y  /\  y >= x)

The final stage of the calculation represents the semantics of the following conditional:

if  x >= y -> z := x
[]  y >= x -> z := y
fi

What I like about the previous derivation is that it is very straight forward. There is a small invention involved which is to introduce true at the beginning and transform it into (2). It seems reasonable since it gives a simple expression for any possible value of x max y. Also, the proof contains a little too much detail. For instance, the first and the second steps could have been merged and explained simply by referring to assumption (2).

On a more personal side, I must admit that, for at least one year, I was very enthusiastic about Ralph Back's lattice theoretic interpretation of programs: it's just so elegant! In practice however, I can't help but being struck with awe at how straightforward the program derivation is with Hehner. Like Dijkstra said so often: we must not become enamored with our tools especially if they show how clever we have been. Back's interpretation introduced me to lattice theory which I still like and find pretty useful but using it for program calculus does not seem necessary and I should probably turn over to Hehner's program calculus even though it does not yell "I know lattice theory!"

Simon

September 18th 2009

Zürich

Monday, September 14, 2009

Trends in programming languages

I just saw a very insightful comment in the 2003 publication of the IFIP work group 2.3 on programming methodology about one of the paper it contains. I cannot attribute it for sure but since Annabelle McIver and Carroll Morgan are the editors, one of them must be at the origin of the comment. It says that in the field of programming languages, one can distinguish two trends. The first, by far the most popular, views a programming language as a formalism to abstract away from the specifics of the machines and the tiny optimizations that might be useful but are tedious to come up with and to maintain. It gives Fortran and Java as examples. Fortran allows the programmer to keep his concentration on other things than the evaluation of mathematical expressions which is quite useful. On the other hand, Java allows the programmer to put the implementation of classes in one place and the latter does not have to think about the specific details of the use of classes (e.g. dynamic binding and memory management). Personnally, I would call those coding languages: they make programming simpler than writing machine code. The second trend views a programming language as a formalism to help reflect on the problem at hand. I give here two of my own examples: Eiffel and B. The fact that they cover analysis, design and implementation is a telling sign that it does not support the view of a program as a sequence of instruction but as a computational solution to a problem. When trying to "sell" languages of the latter category, I often encounter objection stemming from the interlocutor's adherence to the first category and, most of the time, it ticks me off because I have the impression that they are focusing on the wrong aspect of a programming language. Having seen this description of the two trends, I'm thinking of questioning my interlocutor's yardstick instead of getting frustrated. Simon Hudon September 13th 2009 Zürich

Monday, August 17, 2009

Algorithms in Object Oriented Programming

Approximately two years ago, I was writing a set of classes for dealing with graphs. At one point, I had to provide facilities for depth-first and breadth-first traversal. At first, I thought of doing it as a higher order function: it would be a regular encoding of a graph traversal that would use an agent --in Eiffel talk, in C# talk, it would be called a delegate-- to process every node encountered. This seemed a lot of work to set up and use and I decided to create particular kinds of cursors that would enumerate the nodes in preorder, postorder or in breath-first order. It was a very interesting exercise because I transposed the state of the computation of the algorithm to make it the state of a class. At the time, it seemed like a very nice reinterpretation of an algorithm to have it fit a more object oriented context. I did not push that investigation further. At the moment, I am working on the development of a parsing algorithm and its correctness proof. One of the decision I took was to have a token reading primitive instead of passing a complete sequence of tokens. With respect to the formulation of the algorithm, there is very little differences but with respect to its interface, the difference is big enough to allow me to do an interesting observation. The reading can be interpreted as a routine called by the program implementing the parsing algorithm, but it could also be interpreted as a method of a class whose state is described by the local variables of the algorithm. This would allow, for example, to interpret keystrokes as tokens and feed them to the parser as they come. What if we just considered it as an input without regard to whether it is ordered by a client or by the module. With that view, we could have a scanner which outputs one token at a time and inputs one character at a time. We could plug the input of the parser on the output of the scanner for which the input could either be plugged on a streamed string or on the keyboard input without significant changes. Furthermore, if we wanted to decouple further the lexical analysis and the syntactic analysis, we could put a buffer between the two (still without changing the algorithms) and we could even put each machine on its own process. Some theoretical framework which is very useful to conceive that is that of action systems. With just a few exception, an action system could be understood as a canonical view of (not necessarily terminating) programs where we have one loop over a set of one line guarded commands. For example, for the following statement: a := 32 if b > c then b, c := c, b [] b <= c then skip fi We could get the following loop which we will consider as an action system: line := 0 ;do line = 0 then a, line := 32, 1 [] line = 1 /\ b > c then b, c, line := c, b, 2 [] line = 1 /\ b <= c then line := 2 od The advantage of using an action system instead of the corresponding program (when there is a nice one) is that the action system can compose for various purposes. If you want to simulate the concurrent execution of two programs, taking the union of the actions (the lines in the loop) of their action system will give you a model of how the two programs will execute. The assertion that you put in both programs must become invariants in the loop attached to line numbers. It is similar to the methods described in the Gries-Owiki theory and in A Method of Multiprogramming by Netty van Gasteren and Wim Feijen. The other use of the action system is that it can model objects. Each action can be understood as a method or as a method call and it allow one to model "concurrent" access to objects much better than what we have with structured programming geared with type bound procedures (what we see all the time in OO languages). Simon Hudon Finished on August 26th Zürich

Simon's blog prime