As we know, Xing Is Not Gaming. Last night mulletron, Odd_Bloke and I spent a good eight hours peering at the newly released source for Sun's javac. I had personally been putting off looking at the code, not only because it has an odd signup procedure but also because it could so easily distract me from finishing my own compiler project.

Our goal was to implement 'map' functionality similar to the way map works in Python. Our reasoning was that if we could do this, we could then add other higher-order functions such as filter and fold.
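For anyone unfamiliar with the three functions, here is a minimal sketch of what map, filter and fold do, written as ordinary library methods rather than new syntax. It leans on java.util.function, which only arrived in Java 8 (long after the javac we were reading), and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical helper class; none of these names come from javac or the JDK.
public class HigherOrderSketch {
    // map: apply f to every element and collect the results.
    static <T, R> List<R> map(Function<T, R> f, List<T> xs) {
        List<R> out = new ArrayList<>();
        for (T x : xs) {
            out.add(f.apply(x));
        }
        return out;
    }

    // filter: keep only the elements for which p returns true.
    static <T> List<T> filter(Predicate<T> p, List<T> xs) {
        List<T> out = new ArrayList<>();
        for (T x : xs) {
            if (p.test(x)) {
                out.add(x);
            }
        }
        return out;
    }

    // fold: combine all elements into one value, starting from init.
    static <T> T fold(BinaryOperator<T> f, T init, List<T> xs) {
        T acc = init;
        for (T x : xs) {
            acc = f.apply(acc, x);
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 2, 3, 4);
        System.out.println(map(x -> x * 2, xs));
        System.out.println(filter(x -> x % 2 == 0, xs));
        System.out.println(fold((a, b) -> a + b, 0, xs));
    }
}
```

All three are thin wrappers around a loop over the list, which is why getting one of them working as syntax should make the others fall out fairly cheaply.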

We rejected a number of possible syntaxes that added more reserved words to the language (very bad) or violated Wall's First Law of Programming Language Redesign before settling on for <Iterable> do <method>. I like this syntax because 'do' sits between the two operands as an infix that an editor can syntax-highlight. Inside 'real' code, one might encounter it like so:

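import java.util.*;
public class Xing {
    public static void main(String[] args) {
        List<Integer> myList = new ArrayList<Integer>();
        for myList do print;  // proposed syntax: call print on each element
    }

    public static void print(int i) {
        System.out.println(i);  // assumed body; the original listing is cut off here
    }
}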

We've encountered a few problems though. Most people know that Java's enhanced for loop (the for-each form that arrived alongside generics) is actually shorthand for writing out the old-style for loop in conjunction with an Iterator object, and the modern syntax is converted to the older one by a process known as de-sugaring. We planned to implement map functionality by de-sugaring our syntax into the new-style for loop (and then letting the existing code de-sugar that to the old-style one).
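The two de-sugaring steps can be sketched in plain Java. This is only an approximation of what the compiler produces, not javac's actual output, and the class name, helper methods and the extra `out` parameter on print are assumptions added so the two forms can be compared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical class for illustration only.
public class DesugarSketch {
    // Stand-in for print(int); it records output so both loops can be compared.
    static void print(int i, List<String> out) {
        out.add(String.valueOf(i));
    }

    // Step 1: what "for myList do print;" is assumed to de-sugar into.
    static List<String> enhancedFor(List<Integer> myList) {
        List<String> out = new ArrayList<String>();
        for (Integer i : myList) {
            print(i, out);
        }
        return out;
    }

    // Step 2: roughly what the new-style loop is in turn de-sugared into.
    static List<String> iteratorLoop(List<Integer> myList) {
        List<String> out = new ArrayList<String>();
        for (Iterator<Integer> it = myList.iterator(); it.hasNext(); ) {
            Integer i = it.next();
            print(i, out);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> myList = Arrays.asList(1, 2, 3);
        System.out.println(enhancedFor(myList));
        System.out.println(iteratorLoop(myList));
    }
}
```

Both methods visit the same elements in the same order, so they return identical lists. The only difference is whether the Iterator plumbing is written by hand or generated.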

However, the enhanced for loop is de-sugared after dataflow analysis and semantic checking have occurred, which means that we have to implement these (and various other features) for the map functionality as well, which is decidedly non-trivial in something like javac. Hopefully I can make it to Qing this evening to finish it off. Qing Is Not Gaming either, if you hadn't guessed.

Anyway, it turns out that the javac code is messy. Really, really messy. It's a source of great amusement though: not only is there a scary amount of no-op casts, misleading indentation and undocumented functions, but the lexical token for the '@' symbol is 'MONKEYS_AT'. No, we have no idea either.


Comments (2)

Actually, the indentation is consistent if you have your tabs set at 8 spaces, where God intended them.

There's a story behind MONKEYS_AT, and if you know it this little piece of code is a funny inside joke. But if you want me to tell you, you'll have to take back your assertion that javac's code is messy and tell me that it's a work of art. ;-)

Jan. 14, 2007, 12:58 a.m. #

Hi lamby,
do you know the Kitchen Sink Language project? It's an experimental branch of Sun's Java™ compiler in which anyone can commit their own language changes.
About your de-sugar problem: I think you can do your rewriting during the MemberEnter pass, i.e. before the Attribute and Lower passes.


Jan. 18, 2007, 8:36 a.m. #