2008-04-22

Legislative Programming

Don Knuth's Literate Programming has shown us how to match the way a program's entrails are documented with the way the programmer thinks of them, time-wise. When the programmer thinks well, this can result in better programs than what would happen when the programmer would strictly follow the language-compelled structure. Unfortunately, the original WEB's paradigm comes from a simpler era, when a program was just a program, and the programmer's primary work was to write the program.

But within the intervening 30 years, the scope of a programmer's work has widened considerably. Programmer doesn't just write The Program anymore; programmers develop systems via applying the art of programming. This means they also:
  1. design inter-program interfaces
  2. write add-ons to existing programs
  3. even write throwaway code for the express purpose of generating "real" code
  4. construct white box test cases
All of that work is a part of programmer's work, may introduce complicated constructs that should be documented, and merits documented along with The Program itself. Thus, while Knuth's original intention --
The practitioner of literate programming can be regarded as an essayist, whose main concern is with exposition and excellence of style. Such an author, with thesaurus in hand, chooses the names of variables carefully and explains what each variable means. He or she strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that reinforce each other.
-- is good, it is not complete enough.

Given that the scope of a programming job is considerably wider than the construction of a source file, I believe the scope of a literate programming document should be expanded to transcript of whatever the programmer needs to do to resolve this problem. The main difference from a mere log of programmer's activities would be that the document wouldn't need to describe false starts or bugs in detail, although it could caution against the more devious ones. Because such a document could easily be thought of in the style of An Act of the College of Programmers to Develop a System to Gleefully Greet the Globe, I'm jokingly calling this Legislative Programming for now, but a better name will probably be found eventually.

A legislative programming system should be capable of both tangling and weaving, as is Knuth's original system and almost all of its known descendants. However, it should go somewhat further:
  1. It should allow specifying how external commands -- such as the C translator -- are to be invoked. Integration with make or an analogous tool might be desirable. This would define a process of executing the job; the results of such an execution would be all the files the programmer sought to produce -- including the executable, possibly within a container such as JAR or WAR for Java projects.
  2. It should allow specifying several text files in various languages. Many WEB's descendants, such as noweb, already do this.
  3. While it's essential that the programming job be weavable into a single, all-encompassing architectural document, the system should also support creation of daughter documents which should, of course, appropriately appear in the architectural document. Such documents should be used to document specific interfaces of the software system in question -- such as the command line, any configuration parameters or extensibility.
  4. It should provide a convenient approach for building on earlier projects. WEB's change mechanism solves a special case of this problem, offering a way to adapt the underlying system to specific platforms. When a Program Is Just a Program, but there are many systems, this is appropriate. However, by 2008, everybody and his grandma runs some sort of POSIX/Java/ANSI C capable system, which provides for fairly common shared platforms, so adaptation is no longer the primary way of code reuse. Nowadays, the primary ways of code reuse are software extension and aggregation, and neither is easy (or even appropriate) to do via WEB's change mechanism.
  5. It should provide a convenient approach for building on later projects.  For example, consider Forth, a language commonly used for software extensibility.  Let's suppose I've built a framework providing the common Forth services.  A later project exploiting this framework would necessarily want to add new variables, words or perhaps a new interface to virtual files.  To that end, the framework project should be structured to use clean interface for such new definitions, and it's the responsibility of the legislative programming system to facilitate this.  The best way to attain this I currently know involves treating every 'programming job' as an object of a Self-like programming language, and defining composition and prototype-based "inheritance" on these objects.
In Knuth's system, there is a considerable emphasis on prettyprinting. While it's a useful feature, I do not consider it essential, as its influence on the way the programmer works is not very significant.  Furthermore, supporting a wide number of languages makes proper prettyprinting expensive to implement.

I used to consider it important to reveal the structure of the literate programming 'source' file through a DOM-like API to client code, but experience has now convinced me this, too, is a luxury of minor practical utility.  Besides, most benefits of such a system can be achieved by a good throwaway script support.

No comments: