Response by Guy Steele

I agree completely with the claim in Anurag Mendhekar's response that much of the problem lies in the composition mechanisms. Whether we code abstractions so as to present extra control features to a client, or provide optimizers that can weld abstraction and client together more tightly after that fact, this observation clearly indicates that there is more to the problem than just getting interfaces between precompiled binaries right.

It would be wonderful if we could construct various kinds of composition operators, which themselves might have to be subject to the requirements of open implementation. There are several ways to compose two suboperations, for example, such that both suboperations are executed. With respect to performance, Occam provides both SEQ and PAR, which require differing computational resources (processors) and have differing expected execution times (A+B versus MAX(A,B)+). Composing with or without optimization yields different behavior with respect to debuggability using the usual assembly-language or statement-level debugging technology. And so on.

A particularly common phenomenon is that the manner in which an operation is best carried out can vary dramatically depending on whether it is to be carried out once, 10 times, or a million times. There may be opportunities for hoisting constant parts of the computation so that they are precomputed once and used many times. There also may be opportunities for pipelining of various kinds. The Connection Machine Scientific Subroutine Library makes this explicit in its interfaces; it differs from the usual SSL primarily in that there is a second version of each routine in which every array has an extra axis, so you can specify many copies of a problem at once. This allows the CMSSL to make run-time decisions as to which of many algorithms to use, depending on how many copies of a problem are to be carried out and how the input data happens to be distributed among processor memories.

For some purposes, it might be even better if the CMSSL library routines were ``inline,'' at least the topmost routine that makes the decisions as to which algoruthm to use. An awful lot can be accomplished by using a functional language with some control over inlining---unfortunately most programming languages don't support full use of functions as arguments and as result values of procedures. For an extreme example of this, see my paper in the 1994 POPL, which I view as a massive enginneering job on Wadler's use of monads to describe small Lisp interpreters; the result was a small set of modules that could be combined in almost any configuration to produce an interpreter with exactly the set of features contributed by those modules. There were three keys points here:

  1. Every interface had a ``hook'' built in that allowed another client module to modify the input and output data representations and their associated operations, and composing two modules preserves this property.
  2. All modules are combined purely through functional (i.e., procedure-call) composition; a post-optimizer (mostly just beta-convesion of lambda expressions) then eliminates all the ``administrative'' code associated with the interface discipline, resulting in a tightly-coded interpreter in the usual style with no traces left of the interface discipline (and thus no overhead).
  3. The use of monads as an interface structure made it easier to preserve program invariants; in particular, the proper use of the interface discipline could be enforced by the Haskell type system.
In summary, I think that one useful tool for constructing open implementations is to have a framework for allowing implementation information, most importantly information about the context of use, leak across the interface boundary in a manner that is disciplined, type-checkable, and amenable to automation and automated post-optimization. One such approach can use ``good old procedure calls'' and fairly straightforward program optimization technqieus, but that is not necessary the only or the best approach. For example, a common situation is that when two modules are composed, each contributes a looping or ``FORALL'' construct; the best performance is then gained by rearranging the iteration nest in a manner that cannot be exposed until after the composition has occurred. This might be difficult to express clearly in a functional language.

I have concentrated on examples involving performance because that is an area of specific interest to me. I suspect that the ideas I propose to explore may not solve other OI issues. I look forward to discussing them this weekend with you all.

Back to Alphabetical List of Responses

Back to Main OI Workshop Page

Back to OI Home Page


Guy Steele, gls@livia.east.sun.com

(Last Revised October 1994)