Filed in: Ideas.MarkedUpSourceCode · Modified on: Sun, 11 Oct 09
There is a problem with source code. It has exactly two audiences, humans and machines, and it must cater to both at once. Of the two, the machine comes first, since code that doesn't run is worthless.
Herein, I propose MUSC: Marked-Up Source Code, where ordinary source is embiggened with markup!
The main goal is to separate the machine parts from the human parts: the source is meant for the machine, and the markup is meant for the human. Neither part is off-limits to the other audience, though; the source should still be readable by a human, and the markup by a machine.
For example, let's take this spicy Python sample:
Instead, it should be represented as this, in MUSC:
(Note that I might have unintentionally/lazily not escaped XML entities.)
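As a rough illustration of how such a representation could be generated, here is a minimal sketch that turns a function's AST into XML. The element names (`function`, `param`, `doc`, `body`) and the stand-in `foo` function are assumptions made for this sketch, not a fixed MUSC schema. Building the tree with `xml.etree.ElementTree` also takes care of escaping XML entities, so nobody has to do it by hand. It needs Python 3.9+ for `ast.unparse`.

```python
import ast
import xml.etree.ElementTree as ET

# Stand-in sample (hypothetical); the docstring deliberately contains & and <.
SOURCE = '''
def foo(bar, baz):
    """Return bar & baz combined, if bar < baz."""
    return bar + baz
'''


def function_to_musc(source: str) -> ET.Element:
    """Convert the first function definition in `source` to a MUSC-like element."""
    tree = ast.parse(source)
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))

    # Hypothetical schema: <function name=...> with <param>, <doc>, <body> children.
    root = ET.Element("function", name=func.name)
    for arg in func.args.args:
        ET.SubElement(root, "param", name=arg.arg)

    docstring = ast.get_docstring(func)
    doc = ET.SubElement(root, "doc")
    doc.text = docstring or ""

    # Everything except the docstring statement is kept as plain source text.
    statements = func.body[1:] if docstring is not None else func.body
    body = ET.SubElement(root, "body")
    body.text = "\n".join(ast.unparse(stmt) for stmt in statements)
    return root


musc = function_to_musc(SOURCE)
xml_text = ET.tostring(musc, encoding="unicode")  # entities are escaped here
print(xml_text)
```

Parsing `xml_text` back with `ET.fromstring` recovers the original docstring text, so the escaping is lossless.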
So far, not too impressive. It's basically just the code with bits of the AST converted to XML. However, we've added something in the docstring for foo: by marking bar and baz there as references to the function's parameters, a tool could link them back to the function definition. This of course has already been done a hundred times: RST, ctags, epydoc, javadoc, etc. But this is just the beginning:
Furthermore, we could perform various transforms on the code, afforded to us by the magic of XML:
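Two such transforms can be sketched in a few lines of Python. This is only an illustration under assumptions: the `ref` element and its `param` attribute are hypothetical names for the parameter markup in the docstring, and the sample deliberately includes a reference to an undeclared parameter, `qux`. The first transform strips the markup to recover the plain docstring; the second checks that every reference points at a declared parameter.

```python
import xml.etree.ElementTree as ET

# Hypothetical MUSC fragment; <ref param="..."> marks a mention of a parameter.
MUSC = """
<function name="foo">
  <param name="bar"/>
  <param name="baz"/>
  <doc>Combine <ref param="bar">bar</ref> with <ref param="baz">baz</ref>, ignoring <ref param="qux">qux</ref>.</doc>
</function>
"""


def plain_doc(func: ET.Element) -> str:
    """Strip inline markup from the docstring, keeping only its text."""
    return "".join(func.find("doc").itertext())


def dangling_refs(func: ET.Element) -> list:
    """Return referenced parameter names that the function does not declare."""
    declared = {p.get("name") for p in func.findall("param")}
    return [r.get("param") for r in func.iter("ref") if r.get("param") not in declared]


func = ET.fromstring(MUSC)
print(plain_doc(func))
print(dangling_refs(func))
```

The same tree could just as well be fed to XSLT or any other XML tool; the point is only that no language-specific parser is needed.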
Obviously, as mentioned above, there are systems that already do most of what I'm talking about by embedding things in comments or docstrings using yet another markup language (not necessarily YAML, which I think is rather nice). But each of them has its own syntax to learn, its own escape codes, and its own way of embedding into the programming language, which means you need a separate parser for each tool. MUSC would consolidate the features common to most languages into the MUSC core and outsource language-specific things to language-specific namespaces and plugins.
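The core/plugin split could lean on ordinary XML namespaces. The sketch below is an assumption about how that might look, not a specification: the namespace URIs, the `py:decorator` element, and the `PLUGINS` registry are all made up for illustration. A generic tool handles core elements itself, hands elements from a known language namespace to that language's plugin, and skips everything else.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs for the MUSC core and a Python-specific extension.
CORE = "urn:example:musc:core"
PY = "urn:example:musc:python"

MUSC = f"""
<core:function xmlns:core="{CORE}" xmlns:py="{PY}" name="foo">
  <core:param name="bar"/>
  <core:param name="baz"/>
  <py:decorator name="staticmethod"/>
  <core:doc>Combine bar and baz.</core:doc>
</core:function>
"""


def qname(el: ET.Element) -> tuple:
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if el.tag.startswith("{"):
        ns, _, local = el.tag[1:].partition("}")
        return ns, local
    return "", el.tag


# Hypothetical plugin registry: one handler per language namespace.
PLUGINS = {PY: lambda local, el: f"python:{local}({el.get('name')})"}


def process(root: ET.Element) -> list:
    """Handle core elements directly and delegate the rest to plugins."""
    out = []
    for el in root.iter():
        ns, local = qname(el)
        if ns == CORE:
            out.append(f"core:{local}")
        elif ns in PLUGINS:
            out.append(PLUGINS[ns](local, el))
        # Elements from unknown namespaces are silently skipped.
    return out


result = process(ET.fromstring(MUSC))
print(result)
```

A tool that only knows the core namespace would still understand the function, its parameters, and its docstring, which is the consolidation argued for above.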
Part of the problem lies with the way current solutions attempt to shoehorn something into the language they're augmenting: javadocs live in special comments (like ctags), and RST lives in docstrings (which might as well be special comments, except that they're added to the object for introspection purposes). The code is a first-class citizen and the metadata is second-class. With MUSC, the code and the metadata are on the same level: neither is nested inside the other.