Monday, August 30, 2010

Modularity in Design
versus
Modularity in Code

At the moment, the widespread use of libraries is characterized by the reuse of code in contexts other than the one that gave it birth. This has been made much easier by the development of .NET --which I salute as a great technical achievement-- because .NET allows the combination of code written in different languages. I believe it is a stepping stone towards exposing programming languages as reasoning aids rather than execution models, and it is only hampered by Microsoft's lack of commitment towards portability.

However, I suggest that, although it is an important step, it is not satisfactory as far as the concern for reuse goes. I propose that we start thinking about reusing the thoughts behind the code, and that those thoughts are what matters most. This does not mean that reusing code is unimportant --assuming we preserve the documentation of its interfaces-- but it does not go far enough. Indeed, in code, all the different ideas justifying the design choices are mangled together and one can hardly look at each aspect in isolation. In the programming-languages community, several attempts have been made to provide better modules but, so far, we still see a lot of code where different concerns end up intertwined with one another, and I expect it will remain the case.

I think that good modularity cannot be achieved in (what is understood today as) a programming language because of the universal assumption that the code a programmer types has to be directly executable (through compilation).

My favorite example comes from parallel programming. In the field of programming language design, many mechanisms have been proposed to make the task of parallel programming easier, including semaphores, monitors and message passing. They make it easier to avoid low-level design errors like race conditions but they still don't address design issues. In each case, the code is still organized in processes; since the behavior of a parallel program is really a property of the combination of its processes, a change to that behavior requires a change to many modules when the modular boundaries are drawn around processes.

Aside: I consider threads to be a special kind of process and I don't think that, at the design level, differentiating between the two is of any help at all. I will therefore not mention threads again. (end of aside)

As a comparison, UNITY ([CM88]) uses action systems as the basic unit of composition. An action system has some local variables and can share variables with its environment. It also has a set of atomic actions that can modify the variables. The important point is that it is possible to make assumptions about how the data is going to be manipulated, to enforce those assumptions and to use them to prove safety and progress properties. When someone wants to change the functionality of a program, either he can add a new module built around the new properties or he can add the properties to an existing module, if it is close enough to what the module was built for. The key here is to see that program properties (specifically safety and progress properties) are closely mapped to modules, so we get very good encapsulation.
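
For the reader who has never seen UNITY, here is a minimal sketch of the execution model in Python (the representation and the names make_counter_system and run are mine, chosen purely for illustration; UNITY has its own notation): a module is a set of variables together with a set of atomic actions, and an execution repeatedly picks an action and applies it atomically, under the fairness requirement that every action is picked infinitely often.

import random

# Illustrative model of a UNITY-style action system (not UNITY syntax):
# a state (the variables) plus a set of atomic, terminating actions.
# A real execution is infinite and fair; a bounded random run can only
# approximate that.

def make_counter_system(N):
    state = {"k": 0, "done": False}      # variables of the module

    def step(s):                         # action: if k < N then k := k + 1
        if s["k"] < N:
            s["k"] += 1

    def finish(s):                       # action: if k = N then done := true
        if s["k"] == N:
            s["done"] = True

    return state, [step, finish]

def run(state, actions, steps=1000):
    # Approximate a fair schedule by sampling actions uniformly at random.
    for _ in range(steps):
        random.choice(actions)(state)    # one atomic action per step
    return state

print(run(*make_counter_system(10)))     # typically: {'k': 10, 'done': True}

The properties one cares about --for instance the safety property that k never exceeds N, or the progress property that done eventually holds under fairness-- are stated in terms of the module's variables and actions, not in terms of any process structure.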

At the end of a development, a UNITY system consists of a set of actions and a set of variables, both partitioned between modules. If someone wants to execute the system, he first needs to choose the platform on which to execute it (e.g. it could be a network of computers or a network of processes residing on one computer) and then partition the various variables and actions of the system between the components of the platform, without consideration for the modular boundaries. Some variables might become channels between processes or between computers; others may be stored in the memory of a computer, private to one process or shared and possibly protected by locks.

In sequential programming, the control flow is something that can be closely tied to program properties and it can therefore be safely attributed to a modular unit --such as a procedure-- because the properties of interest --preconditions, postconditions and termination-- apply directly to the syntactic structures describing the control flow. That is to say, pre- and postconditions come in pairs and, between them, lies a sequence of statements structured in a fashion suitable for proving correctness. Similarly, loops are associated with invariants and variants and nothing outside of the loop is relevant for reasoning about said loop.
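
As a purely illustrative example of my own (written in Python), here is a small loop annotated with exactly those properties; everything needed to reason about it sits right next to it, and nothing outside the loop is relevant.

def sum_of(a):
    # Precondition:  a is a list of numbers.
    # Postcondition: result = a[0] + ... + a[len(a)-1].
    s, i = 0, 0
    # Invariant: s = a[0] + ... + a[i-1]  and  0 <= i <= len(a)
    # Variant:   len(a) - i  (a natural number, strictly decreasing)
    while i < len(a):
        s = s + a[i]
        i = i + 1
    # On exit: the invariant holds and i = len(a), hence s is the required sum.
    return s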

In contrast, [FvG99] shows the kind of proof obligations one has to contend with when working with processes in parallel programming. In their theory, each assertion has a local correctness, which is analogous to partial correctness in sequential programming, and a global correctness, which relies on the structure of the other processes. At this point, it is unthinkable to take one process away and expect that the properties one has proven about it will hold in a different environment. In short, the interface between processes, when made explicit, is far from thin; too far to allow processes to properly encapsulate anything.

In UNITY, the fact that it is based on action systems rather than processes makes it possible to attribute a precise and small set of properties to a module and to work on it in isolation. The usual process is called refinement and it makes all the intermediate design decisions explicit. This makes the effort spent on a development easier to transpose to different situations, even when none of the code produced is actually useful.

The UNITY method has proof rules and a formal semantics, and the most rigorous approaches will take advantage of them. However, it is not necessary to use the whole battery to reap the benefits of the method. Since modularity is something UNITY does well, using it and its temporal logic as guidelines for design can already be beneficial in pointing out what we must be careful about.

In short, I believe it is a property of paramount importance for modular boundaries that they confine each programming task to the construction or modification of only one module. The best way to achieve that is to choose a unit of modularity for which the proofs of correctness rely on only a few simple assumptions about the surrounding environment, chosen a priori --that is, chosen as much as possible independently of any implementation. Some might say that it is an obvious goal but, as far as I can see, very few modular decomposition mechanisms are satisfactory in that respect, and I think that most language designers are still confined by "how the program is going to be executed" and are therefore unable to meet my criteria.

*                         *
*

I know of another effort directed at the reuse of design: the so-called design patterns. The idea is a good one but its realization is much too weak. They are basically patterns of modular decomposition which have proved useful in many situations. However, there is no basis on which to state precisely what properties make them useful or how they meet those properties. In short, they cannot have an existence of their own because no notion of correctness applies to them. I do recognize them as a step in the right direction but they seem over-hyped for what they provide. I would welcome the publication of books of that kind where the designs are proved correct, where it is made (formally) precise how the modules fit into a larger design and how one should proceed to code them correctly.

Next, I would have liked to address some of the objections to formal methods that I have heard again and again and that used to unsettle me. However, this post is already pretty long and, by now, it is long overdue. I will, therefore, leave it for a post which, hopefully, will be coming soon.

Simon Hudon
April 10th, 2011
Meilen

References

[CM88] K.M. Chandy and J. Misra, Parallel Program Design:
       A Foundation, Addison-Wesley, 1988

[FvG99] W.H.J. Feijen, A.J.M. van Gasteren, On a Method of
       Multiprogramming, Springer-Verlag New York, Inc.,
       1999

Thursday, August 5, 2010

The Development of a Solution to the Problem of the Cubes

Note:
Before I begin, let me point out that the publication of this blog entry was made possible because I haven't used Blogger's interface to edit it (or at least, I used it as little as possible). Instead, I wrote it in a plain text editor using a markup language I am developing. Since I haven't developed an HTML generator yet, I had to convert it to HTML myself. Blogger did not make it easy but I think the result is much better than what I got previously by just fighting with the web interface. I would certainly appreciate any feedback concerning the resulting layout.
(end of note)

The first statement of this problem that I have seen is in Wim Feijen's note WF114 [0]. I really like the structure of his presentation but, whenever I have reproduced it, either for myself or to present the problem and a method to find a solution to somebody else, I have made one slight change: instead of using predicate transformer semantics, I use a relational semantics, and I believe the switch to be quite significant from a practical standpoint. I still use the invariant-based loop development techniques developed by Dijkstra & co. instead of the recursion / precondition technique of Hehner, and I think this is very characteristic of my style: it shows that I prefer to manipulate one small part of the program at a time.

DEVELOPMENT
===========

The problem at hand consists in producing an array containing the first N cubes without resorting to multiplication. Formally, the postcondition we are trying to satisfy is:
 P0:   (A i: 0 ≤ i < N: f.i = i^3)
with f the output variable (a function). Note on notation: '.' is the function application operator and (A i: R: T) is a universal quantification over the range R with the term T. Here, we use the property that ≤ and < are mutually conjunctional, which means that their combined application can be unfolded into
 a ≤ b < c  ≡  a ≤ b ∧ b < c
They are also mutually conjunctional with equality (=), whereas logical equivalence (≡) is not conjunctional at all because it is associative and the two features of the operator would interfere. (end of note)

We can now proceed to find a suitable invariant for our loop. A good technique to do so is to replace some constants in the postcondition by variables. Since it is a quantification, it seems reasonable to focus our search on the range. There we find 0 and N. We could choose either or both but, if we use 0 as our initial value and N as the final one, we get the first cube for free. We will therefore opt for replacing only N with a variable:
 E0:   k = N
 J0:   (A i: 0 ≤ i < k: f.i = i^3)
If we start k off at 0, the range of J0 becomes false and the invariant is therefore trivially true. From there, it is not a big step to say that increasing k regularly will lead us to E0 (our variant is then N-k).
 I0:   k = 0
 S0:   k' = k + 1
It is now time to see how we will maintain J0 (in the loop) while applying S0 to converge.

Note on proof obligations: If we use the technique of the invariant to build our loop, we have to fulfill certain proof obligations. Assuming that J is the invariant, B is the body of the loop, E is the exit condition, IN is the initialization of the loop and P is the postcondition:
 (0)   J'  ⇐  J ∧ B ∧ ¬ E    "B (and E) preserve J"
 (1)   J  ⇐  IN              "IN establishes J"
 (2)   P  ⇐  J ∧ E           "J and E lead to P"
In the preceding definitions, primed predicates are predicates for which all unprimed program variables are replaced with their primed version. Unprimed variables designate their initial values (before the execution of a given statement or program) and the primed variables designate their final values.
 (3)     (B and E preserve J) 
       ∧ (IN establishes J) 
       ∧ (J and E lead to P)
    ≡ 
       { IN } until E do B od { P }
The last line is to be interpreted as a Hoare triple with assertions between brackets. It reads "started in a state satisfying IN, the program [here the while loop] finishes in a state satisfying P." (end of note)

We can now calculate, starting from J0 with all its variables primed:
   (A i: 0 ≤ i < k': f'.i = i^3)
 =   { S0 }
   (A i: 0 ≤ i < k + 1: f'.i = i^3)
 =   { Split off one term }
   (A i: 0 ≤ i < k: f'.i = i^3) ∧ f'.k = k^3
 =   { Modify only f.k; S1: (A i: 0 ≤ i < k: f'.i = f.i) }
   J0 ∧ f'.k = k^3
 S1:   (A i: 0 ≤ i < k: f'.i = f.i)
The last line of the previous proof makes a nice assignment. The only problem is that it requires some multiplications, so we will continue our calculations rather than adopt it as our next assignment.
   f'.k = k^3
 =   { Let's introduce a new variable to hold k^3 }
     { at all time.  J1: a = k^3                  }
   f'.k = a
 S2:   f'.k = a
 J1:   a = k^3
We could also add the symmetric complement of S1, namely that everything beyond k in the values of f stays unchanged, but it is not required yet. We can add it afterwards to round S1 and S2 out into array assignments; it is not urgent.
 (4)   (J0'  ≡  J0 ∧ S2)  ⇐  J1 ∧ S0 ∧ S1
Since the left argument of ⇐ is monotonic, we can weaken (4) by weakening its left-hand side and replacing the equivalence by a consequence (⇐). After applying the shunting rule (see (A) in the appendix), we have exactly the shape of the proof obligation for maintaining an invariant... except that the invariant used to maintain J0 (i.e. J0 ∧ J1) is much stronger than J0. We can prove that we can maintain it too, though.
 (5)   J0'  ⇐  J0 ∧ J1 ∧ S0 ∧ S1 ∧ S2
To maintain J1, we split it into its two sides and manipulate the side involving k --since we know a lot about k-- in the hope that it will lead to operations on a.
   k'^3
 =   { S0 }
   (k + 1)^3
 =   { Unfold _ ^3; ∙ over + }
   k^3 + 3∙k^2 + 3∙k + 1
 =   { J1 }
   a + 3∙k^2 + 3∙k + 1
 =   { We need a new variable to hold the term which   }
     { contains a product.  J2: b = 3∙k^2 + 3∙k + 1     }
   a + b
 =   { We want to fall back on J1'.  S3: a' = a + b is a }
     { good candidate.                                    }
   a'
 J2:   b = 3∙k^2 + 3∙k + 1
 S3:   a' = a + b
 (6)   J1'  ⇐  J1 ∧ J2 ∧ S0 ∧ S3
It seems like we have just created some more work for ourselves but we can be reassured by the decreasing degree of the expressions in our new invariants.
   3∙k'^2 + 3∙k' + 1
 =   { S0 }
   3∙(k + 1)^2 + 3∙(k + 1) + 1
 =   { ∙ over + }
   3∙k^2 + 6∙k + 3 + 3∙k + 3 + 1
 =   { J2 }
   b + 6∙k + 6
 =   { You know the drill! J3: c = 6∙k + 6 }
   b + c
 =   { S4: b' = b + c }
   b'
 J3:   c = 6∙k + 6
 S4:   b' = b + c
 (7)   J2' ⇐  J2 ∧ J3 ∧ S0 ∧ S4
Now for the last touch on the invariant:
   6∙k' + 6
 =   { S0 }
   6∙k + 6 + 6
 =   { J3 }
   c + 6
 =   { S5: c' = c + 6 }
   c'
 S5:   c' = c + 6
 (8)   J3'  ⇐  J3 ∧ S0 ∧ S5
We can now happily conclude
 (9)   S6 preserves J4
with
 S6:   S0 ∧ S1 ∧ S2 ∧ S3 ∧ S4 ∧ S5
 J4:   J0 ∧ J1 ∧ J2 ∧ J3
We started off knowing that J0 ∧ E0 ⇒ P0 so all that is left for a correct loop is to find the initialization. We already have I0 and, from there, we can easily calculate the initial values of the auxiliary variables. Let's ignore the calculations and see the initial values right away.
 I1:   a = 0
 I2:   b = 1
 I3:   c = 6
We can see that there is no value specified for f and, as a matter of fact, any value will do. We now have the final lemma for the correctness of this loop.
 (10)  I0 ∧ I1 ∧ I2 ∧ I3 establishes J0 ∧ J1 ∧ J2 ∧ J3
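For the record, here is one possible executable rendering of the result (in Python; this is not part of the development above, and the order chosen for the assignments is only one of several valid ones, a concern taken up in the conclusion below: no variable is overwritten before its old value has been used).

def cubes(N):
    # Postcondition P0: f.i = i^3 for 0 <= i < N, computed without multiplication.
    f = [None for _ in range(N)]   # the initial value of f is irrelevant
    k, a, b, c = 0, 0, 1, 6        # initialization I0, I1, I2, I3
    # Invariants:  J0: f.i = i^3 for 0 <= i < k,   J1: a = k^3,
    #              J2: b = 3k^2 + 3k + 1,          J3: c = 6k + 6
    while k != N:                  # exit condition E0: k = N
        f[k] = a                   # S2 (with S1: the other entries of f untouched)
        a = a + b                  # S3
        b = b + c                  # S4
        c = c + 6                  # S5
        k = k + 1                  # S0
    return f

print(cubes(10))   # [0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
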
CONCLUSION
==========

Looking at the end result, we can see that we ended up with a series of assignments (SX) but we haven't specified an order in which they have to be executed. This is because coming up with an order is both simple and tedious, and the order is so arbitrary that it is all noise. For that reason, we can trust a compiler to do it for us. When we have to deal with a programming language that doesn't support multiple simultaneous assignments, we have to take the time to introduce the noise that is the order ourselves. The key here is that we must not overwrite a variable before we are done with its initial value. We have a bit of a problem when we run into an assignment where the uses and the writes of variables are cyclically dependent, as in the following:
 S7:   x' = y ∧ y' = x
To solve the problem, we can add an auxiliary variable (say, t) and initialize it so that it can be substituted for every use of a certain variable.
 I4:   t = x
   x' = y ∧ y' = x
 =   { I4 }
   x' = y ∧ y' = t
 =   { Assignment law (B) }
   (x := y) ; (x' = x ∧ y' = t)
 =   { Assignment law (B) }
   (x := y) ; (y := t) ; (x' = x ∧ y' = y)
 ⇐    { Skip introduction (C) }
   (x := y) ; (y := t) ; skip
 =    { Identity of ; (D) }
   (x := y) ; (y := t)
And, of course, it is also simple to implement I4 and prepend it to the program we just derived. Note: since we are interested in a program that ends up establishing I4, we again use primed variables in its specification. (end of note)
   t' = x ∧ x' = x ∧ y' = y
 =   { Assignment law (B) }
   (t := x) ; (t' = t ∧ x' = x ∧ y' = y)
 ⇐   { Skip introduction (C) }
   (t := x) ; skip
 =   { Identity of ; (D) }
   t := x
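Putting the two derivations together, the sequentialized simultaneous assignment reads as follows (written in Python only to have something concrete; Python itself supports the simultaneous form x, y = y, x, so the exercise matters only for languages that do not).

x, y = 3, 5
t = x        # t := x   (establishes I4: t = x)
x = y        # x := y
y = t        # y := t
print(x, y)  # 5 3
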
In short, this is why I don't bother to do it with the body of my loop.

Finally, the main interest of this programming exercise does not lie in the resulting program. Quite frankly, it accomplishes quite a banal task. However, the technique which led us to it is quite systematic. We used very little knowledge to find it and, unlike Feijen's version, we did not introduce assignments of '?' to mark the need to create a new assignment and to then 'guess' what the expression should be. To be honest, it is not a very hard guess but it is still outside of the scope of a calculation. The fact that I integrated that choice into a calculation makes it easier to justify.

Regards,
Simon Hudon
ETH Zürich

APPENDIX
========

Unusual laws of propositional calculus
(A)   A ∧ B ⇒ C  ≡  A ⇒ (B ⇒ C)    { "shunting" }
Hehner's programming laws [1]
(B)   (x := E) ; P  ≡  P [x \ E]     
              { "assignment law" P [x \ E] is the }
              { variable substitution notation,   }
              { P is a predicate, x a program     }
              { variable and E an expression      }
(C)   x' = x  ⇐   skip              
              { For any program variable x }
(D)   P ; skip  ≡   P                
              { For any program predicate P }
REFERENCES
==========

[0] Wim Feijen, WF114 - The problem of the table of cubes,
    http://www.mathmeth.com/wf/files/wf1xx/wf114.pdf
[1] Eric Hehner, A Practical Theory of Programming,
    http://www.cs.toronto.edu/~hehner/aPToP/

Sunday, August 1, 2010

A Prehistoric Qualification of Languages: Succinctness

I read this morning the essay Succinctness = Power [0] by Paul Graham which, so far, I find (note the present tense) to be quite smart. I can't believe he would use such a metric as lines of code or number of symbols as a meaningful metric to compare implementations with. He says that one of the goal of high level programming languages is to be more succinct for expressing a given amount of machine code. Of all people, I would expect that somebody who spent so much time with LISP would see the the fallacy. The idea and benefit of a high level language is that it allows you to think in higher level IDEAS. Those ideas can be implemented using half a machine code instruction or a thousand, it is of no relevance for its use. What IS relevant is whether the properties of that particular abstraction are good enough to fit in the context of the solutions but include very little (I am tempted to say none at all) information which does not pertain to the context it will be used in. And, finally, this is something LISP is interesting for because it drives home the message that those particular abstractions can be encapsulated in functions or data types but you can also build up the language to cover it too and It is not too say I prefer long solutions but I see succinctness as a consequence --and not the main one at that-- of a language's "power" (which I find to be a poor choice of adjective especially since he does not define how he uses it right away). Its real power lies in the variety of abstractions that one can define and use. If you exaggerate the aim for succinctness, you end up with APL or PERL which are notoriously bad. Their goal is not abstraction but packing as many operations in tiny expressions. As a consequence, those solutions are unreadable and they just seem like dignified way of writing short machine code. It gives no more flexibility and no more readability. This comment was mostly about the beginning the essay. I don't have much to say about the rest except that, at some points, he can expose some really nice insights but it really started off on the foot and I didn't expect the rest of the essay to compensate for that and, in my view, it didn't. [0] Paul Graham, Succinctness = Power, http://www.paulgraham.com/power.html Regards, Simon Hudon ETH Zürich August 6th 2010