Simon's blog prime: language design

Showing posts with label language design. Show all posts

Sunday, August 1, 2010

A Prehistoric Qualification of Languages: Succinctness

I read this morning the essay Succinctness = Power [0] by Paul Graham which, so far, I find (note the present tense) to be quite smart. I can't believe he would use such a metric as lines of code or number of symbols as a meaningful metric to compare implementations with. He says that one of the goal of high level programming languages is to be more succinct for expressing a given amount of machine code. Of all people, I would expect that somebody who spent so much time with LISP would see the the fallacy. The idea and benefit of a high level language is that it allows you to think in higher level IDEAS. Those ideas can be implemented using half a machine code instruction or a thousand, it is of no relevance for its use. What IS relevant is whether the properties of that particular abstraction are good enough to fit in the context of the solutions but include very little (I am tempted to say none at all) information which does not pertain to the context it will be used in. And, finally, this is something LISP is interesting for because it drives home the message that those particular abstractions can be encapsulated in functions or data types but you can also build up the language to cover it too and It is not too say I prefer long solutions but I see succinctness as a consequence --and not the main one at that-- of a language's "power" (which I find to be a poor choice of adjective especially since he does not define how he uses it right away). Its real power lies in the variety of abstractions that one can define and use. If you exaggerate the aim for succinctness, you end up with APL or PERL which are notoriously bad. Their goal is not abstraction but packing as many operations in tiny expressions. As a consequence, those solutions are unreadable and they just seem like dignified way of writing short machine code. It gives no more flexibility and no more readability. This comment was mostly about the beginning the essay. I don't have much to say about the rest except that, at some points, he can expose some really nice insights but it really started off on the foot and I didn't expect the rest of the essay to compensate for that and, in my view, it didn't. [0] Paul Graham, Succinctness = Power, http://www.paulgraham.com/power.html Regards, Simon Hudon ETH Zürich August 6th 2010

Wednesday, November 4, 2009

Bad Separation of Concerns

In the course of my master's, I have to attend a seminar and prepare one talk. Last week, the talk I listened to was about using pattern matching in object oriented programming languages. I thought, as the talk went on, that all the yardstick they took to evaluate the design of their language construct was all wrong. It struggled, in my opinion in vain, to improve the expressiveness while retaining the best possible performances and remaining "concrete". In that respect, I tend to take for granted the necessity of expressing precise specifications with every piece of code. This means that, for me, the most important is that my specification language be expressive enough, but still very simple, to allow me to reason about my problems and my implementation language be as simple as possible and allow me to express any algorithm I wish. In my experience, the specification language should involve as little operational elements as possible for simplicity. As for the implementation language, it should be possible to implement it on any reasonable architecture. Before I'm objected that it is not customary in industry to use precise specification notations or to keep multiple representations, at different degrees of abstraction, of a single solution, I have to point out that this is an academic result and it does not have to follow industrial customs. Actually, I would go one step further and state that, if, in academia, we wait for techniques to become widely adopted before using them, we are lagging behind and risk doing no more than producing more of the same. Academia being more remote from the market place than industry are supposed to be less affected by the political issues that keep industry behind the state of the art. And this is not badmouthing the industry: it is understandable that they have more concerns than only technological ones. Let's get back to the paper. In their design, they seemed to tackle both problems of expressiveness and implementation at once and the result was not pretty. For example, in their pattern matching, they had to take into consideration that the evaluation of the expression (which, in their case, is an absolute necessity) they were dealing with could very well have side effects and this was an important concern of theirs. I think that this possibility was popularized by C and I think this is an important set back with respect to allowing programmers to think about their programs. I think, already, the discipline of separating queries from commands, which is part of Eiffel's style, is much better in this respect. They also had to deal with constructors in order to match objects like you would in Haskell. This is very ugly. Pattern matching is a very elegant way of writing expressive specifications but object creations do not fit in it. This is because an expression creating an object has an implicit reference to memory allocation and it can't be dissociated to it. People can't forget that the object has a memory representation and that a comparison in the context of pattern matching will have a given cost. I prefer to use mathematical expressions to express the value contained by the object and allocate the object only if necessary.

Aside In a recent programming experiment where I derived my code from its proof of correctness, I was parsing a string and, since all the properties of a language can so elegantly be formulated in terms of strings of character, I used strings to represent the state of the computation. I had my input string S, the processed string x and the left over string z and they were linked by the invariant:

S = x ++ z

with ++ the string concatenation operator. At each iteration, I split z in two parts such that

z = [ y ] ++ w

and scanning one character was encoded as:

x, z := x ++ [ y ], w

If you can't help yourself and have to imagine the memory footprint and performances of this algorithm, you might be scandalized that this is really inefficient. This is not meant to be executed as such; it is only a very simple way of understanding defining what the algorithm does. With one data refinement, I went to a representation where the use of memory is limited to the array representing the input, an integer index indicating which character y is and two booleans to represent the acceptance state of the algorithm. There is no efficiency problem with that, and my proof of correctness and my proof of data refinement are very simple and I consider them the clearest documentation for the algorithm.

end of aside

In summary, I think the right way of seeing that problem is:

Pattern Matching is a power way of associating properties or code with the shape of the data and we need to be able to express it. Since we can use it with code, it would be important to have programming techniques to make the executable, that is, techniques usable by a programmer to transform a specification using pattern matching into executable code implementing it. If the methods are especially effective and simple, we could have it implemented in our compiler but this is far from being a design consideration for the construct.

* * *

More recently, in my compiler design class, we were presented a research programming language which is said to be "relationship-based". I think the idea is a good one: they factor the inter-class relationship out of classes to treat them as separate modules. The question whether each such relationship deserve its own module is worthy of debate but, for now, what bothers me is that they treat it as a programming language construct, that is to say they keep an "implementability anchor" in their heads while they design the notation. One of the consequence is that they will refrain from adding anything for which they aren't sure they can compile it automatically. I would rather see this emerge as a specification structuring artifact and see some programming techniques emerge in connection to it so that the most common one can be implemented quite straightforwardly. For the rest, it's open to experimentation and, in all cases, a case by case implementation would allow optimization that systematic compilation wouldn't be able to do.

I think the big problem here is still the taking of current practices as a guideline for academic research: "since programmers don't write specifications, we won't provide a means to write good specifications even though progress would require it". I put it in my "more of the same" bin, look at the original ideas and forget about the language. Actually, I will do so when my semester is over.

Simon Hudon

ETH Zürich

finished on November 18th

Monday, September 14, 2009

Trends in programming languages

I just saw a very insightful comment in the 2003 publication of the IFIP work group 2.3 on programming methodology about one of the paper it contains. I cannot attribute it for sure but since Annabelle McIver and Carroll Morgan are the editors, one of them must be at the origin of the comment. It says that in the field of programming languages, one can distinguish two trends. The first, by far the most popular, views a programming language as a formalism to abstract away from the specifics of the machines and the tiny optimizations that might be useful but are tedious to come up with and to maintain. It gives Fortran and Java as examples. Fortran allows the programmer to keep his concentration on other things than the evaluation of mathematical expressions which is quite useful. On the other hand, Java allows the programmer to put the implementation of classes in one place and the latter does not have to think about the specific details of the use of classes (e.g. dynamic binding and memory management). Personnally, I would call those coding languages: they make programming simpler than writing machine code. The second trend views a programming language as a formalism to help reflect on the problem at hand. I give here two of my own examples: Eiffel and B. The fact that they cover analysis, design and implementation is a telling sign that it does not support the view of a program as a sequence of instruction but as a computational solution to a problem. When trying to "sell" languages of the latter category, I often encounter objection stemming from the interlocutor's adherence to the first category and, most of the time, it ticks me off because I have the impression that they are focusing on the wrong aspect of a programming language. Having seen this description of the two trends, I'm thinking of questioning my interlocutor's yardstick instead of getting frustrated. Simon Hudon September 13th 2009 Zürich

Friday, July 10, 2009

On Programming Languages

Sometimes, some programming languages popularize powerful ways of thinking about a problem. It is not too rare however that a language would create popular way of obfuscating a problem. APL and C++ are an example of the former. The difference between the two is that C++ allow you to believe you actually understand the problem when you don't have a clue about it. On the other hand, I have never met anybody knowing APL that claimed to be able to read an APL program. Simon Hudon July 10th 2009 Zürich

Simon's blog prime

Sunday, August 1, 2010

A Prehistoric Qualification of Languages: Succinctness

Wednesday, November 4, 2009

Bad Separation of Concerns

Monday, September 14, 2009

Trends in programming languages

Friday, July 10, 2009

On Programming Languages

About Me

Labels

Followers

Interesting essay collections and blogs

Blog Archive