sh. It was called Erlightning, team The Blue
Avengers. (Previous edit of this page said 385 lines of Erlang - I
forgot to count the header file).
We downloaded the task, talked about it for a while, and chose to use a list of{Decorations, Text} tuples for our basic datatype. Then Joe wrote a parser, Robert wrote a whitespace eliminator, and I wrote an outputer. We put these bits together and tested them and it seemed to work pretty well. Now we had a simple functioning program to work from.
sh hacking to establish some invariants demanded by the judgement
criteria:
| whitespace compression | We implement this. |
| redundancy elimintation | This disappears in parsing, since the parse tree contains the "meaning" of the document, not a tag tree. |
| overlap inversion | Our tag reordering achieves this, but it's limited by the lookahead. |
| PL shortcut | We don't do this at all, which is the major limitation of our program. There is never a PL tag in our output. |
| whitespace simplification | We implement this, same code as whitespace compression. |
| EM elimination | Like redundancy elimination, we get this in parsing. |
| color nesting | We do this, though it ties in with tag reordering and so it's limited by the lookahead. |
When the program was working, we made a random input generator that produces arbitrarily large and deeply nested files, and this squeezed a bug or two out. Thousands of random files of random sizes is also a much more interesting thing to try optimising than just example.txt (which it seems a simple algorithm can perfectly optimise, except for the PL opportunities at the end - or so says my attempt at hand-optimising it on the bus).
Only a dozen or two of these random files were actually fed through the equivalence testing CGI, and their sizes were relatively small because of limitations of that tester.
We also had a small Erlang program that did the appropriate HTTP POST to the validation CGI, but after a couple of misleading results due to bad parsing we stopped using it.
I believe that our program runs in linear time and space and should have no trouble with either on the contest machine (not meaning to jinx anything :-)).
It's also a comfort that the sh script will mask any
programming errors, since if we failed visibly on a type error we
could never show our faces in comp.lang.functional again ;-)