Tuesday, April 22, 2008

Syntactic Sugars - What makes them sweet ?

Stephen Colebourne talks about implementing for-each over maps in Java. He has proposed changes to javac and queued up his request for approval by the appropriate authorities. It is good to see Java community leads taking serious steps towards syntactic sugar in the language. I am always for intention-revealing syntactic sugar. After all, "programs must be written for people to read, and only incidentally for machines to execute".

Syntactic sugars, when properly designed, reduce the semantic distance between the problem domain and the solution domain. They do not add new features or capabilities to an existing language. Still, we value them mainly for social reasons: they make your abstractions more explicit and your intentions more direct. And they often lead to concise code that is much more pleasing to the eye.

Java is not a language that boasts of concise syntax. Yet the enhanced for loop introduced in Java 5 removes a lot of accidental complexity and makes the programmer's intention much more explicit:

for(String name : names) {
  // process name
}

is much more succinct than

for(Iterator<String> it = names.iterator(); it.hasNext(); ) {
  String name = it.next();
  // process name
}

since the latter snippet buries its intent under the verbosity of structures that are not directly related to what the programmer wants to do.

names foreach println

is better, though not Java.
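For what it is worth, Java 8 and later (which arrived well after this post was written) get fairly close to that one-liner with a method reference. A minimal sketch, where the class and method names are my own illustrative choices:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForEachSketch {
    // Hypothetical helper: feeds every name to a method reference,
    // the same shape as `names foreach println`, but collecting instead of printing.
    static List<String> collect(List<String> names) {
        List<String> seen = new ArrayList<>();
        names.forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ada", "Grace", "Barbara");
        // The closest Java spelling of `names foreach println`.
        names.forEach(System.out::println);
    }
}
```

The loop variable, the iterator and the braces are all gone; what remains is the collection and the action applied to each element.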

Many of the languages in use today offer plenty of syntactic sugar that abstracts rich capabilities of the underlying language. Take, for example, the Groovy Builder syntax, which exploits meta-programming, closures and named arguments to implement elegant, concise, intuitive APIs. Java developers use binding frameworks to manipulate XML and bind it to the model or to the relational database schema. Not that there is anything wrong with that. But the developer has to jump through all the hoops of mapping the object structure to the XML schema, and use an external framework like JAXB, only to end up with a much longer version of the solution than Groovy MarkupBuilders would give.

Syntactic sugars are nothing new in the landscape of programming languages. It all started (possibly) with Lisp, which offers macros as the means to design syntactic abstractions. To add a little sugar to the syntax the language offers, you do not have to wait for the next official release. In Lisp, the syntax of the program is a direct representation of the AST, and with macros you can manipulate the parse tree directly. Languages like Lisp are known for their syntax extensibility, which allows developers to implement their own syntactic sugar.

Ruby offers runtime meta-programming, another technique for adding your own syntactic sugars. Ruby does not have a macro system that lets you play around with the abstract syntax tree, though Ryan Davis has released a Ruby parser written entirely in Ruby. The meta-object protocol offered by Ruby gives developers control over the language semantics (not the syntax) and can generate classes and methods dynamically at runtime. Meta-programming, method_missing, open classes and optional parentheses are some of the features that make Ruby a great language for building syntax abstractions for runtime processing.

A language built on the philosophy of bottom-up programming offers extensible syntax (be it through the syntactic abstractions of Lisp or the semantic customizations of Ruby), on which developers can construct syntactic sugars. Java believes in democratization of all syntax offered by the language, and it may take quite a few years to make official the little sugar that you have been yearning for. Remember the explosion of blog posts celebrating the for-each loop when it came out with Java 5. In other languages, people build new syntax by the day and evolve new vocabulary within the same language that maps into the domain that they model. If you miss a feature that you enjoyed in your earlier language, just build it over the new one. And that does not necessarily mean hooking into the compilation cycle or plugging customized modules into your language parser.

Many of today's languages offer strong enough capabilities to build structures that look like syntax extensions. Scala is one that makes the cut in this category. The advanced type system of Scala enables developers to write control structures, within the syntax of the language, that look like syntactic abstractions. Max likes deterministic finalization in C# and its idiomatic usage with the "using" keyword. He has implemented the same syntax in Scala using closures, view bounds and implicit conversions. Besides eliminating lots of boilerplate, his extension looks charmingly useful for the domain he is using it in.
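To make the idea concrete, here is a rough sketch of a "using"-style construct built from nothing more than a higher-order function. This is not Max's Scala code; it is my own approximation in Java 8 or later (which postdates this post), and `using`, `UsingSketch` and `Tracked` are hypothetical names invented for the illustration:

```java
import java.util.function.Function;

class UsingSketch {
    // Hypothetical helper: lends the resource to the body and closes it afterwards,
    // whether the body returns normally or throws.
    static <R extends AutoCloseable, T> T using(R resource, Function<R, T> body) {
        try (R r = resource) {
            return body.apply(r);
        } catch (RuntimeException e) {
            throw e; // let unchecked failures from the body or close() pass through
        } catch (Exception e) {
            // close() may throw a checked exception; wrap it to keep call sites terse
            throw new IllegalStateException("closing the resource failed", e);
        }
    }

    // Illustrative resource that records whether it has been closed.
    static class Tracked implements AutoCloseable {
        boolean closed = false;
        String read() { return "data"; }
        @Override public void close() { closed = true; }
    }

    public static void main(String[] args) {
        Tracked t = new Tracked();
        String result = using(t, r -> r.read().toUpperCase());
        System.out.println(result + " closed=" + t.closed); // prints "DATA closed=true"
    }
}
```

The call site reads almost like a built-in statement, yet it is an ordinary method taking a function. Java 7 later added try-with-resources as real syntax, which this sketch leans on internally.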

Syntax extensibility is a necessity if you want your language to support the evolution of DSLs. Extensible syntax scales much better than the framework-based approach so popularized by Java. When you add a new syntactic structure to your language, it meshes so nicely with the rest of the language constructs that you never feel that it has been bolted on from outside, although in reality it is nothing more than syntactic sugar presented in a form that makes more sense when coding for the particular problem at hand. When we talk about a language, we think in terms of parsers, and this is no different when we think about DSLs. Implementing an external DSL is hard, considering the enormous complexity that parser generators make you go through. Scala offers monadic parser combinators, with which you can map your EBNF syntactic structures directly into implementations. And all this is done through syntactic sugars on top of the closures and higher order functions that the language offers.
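The flavour of parser combinators can be sketched even without Scala's sugar. What follows is a toy of my own in Java 8 or later, not the Scala library's API: every name in it (`ParserSketch`, `Parser`, `Result`, `literal`, `succeed`, `sum`) is an assumption made for illustration. It implements the EBNF rule `sum = number [ "+" sum ]` by composing small parsers with `flatMap`, `map` and `or`:

```java
import java.util.Optional;
import java.util.function.Function;

class ParserSketch {
    // A successful parse: the value produced plus the input that is left over.
    static final class Result<T> {
        final T value;
        final String rest;
        Result(T value, String rest) { this.value = value; this.rest = rest; }
    }

    // A parser is a function from input to an optional result (empty means failure).
    interface Parser<T> {
        Optional<Result<T>> parse(String in);

        // Sequencing (monadic bind): run this parser, then the parser chosen from its value.
        default <U> Parser<U> flatMap(Function<T, Parser<U>> f) {
            return in -> parse(in).flatMap(r -> f.apply(r.value).parse(r.rest));
        }

        // Transform the parsed value without consuming more input.
        default <U> Parser<U> map(Function<T, U> f) {
            return in -> parse(in).map(r -> new Result<U>(f.apply(r.value), r.rest));
        }

        // Alternation, EBNF's `a | b`: try this parser, fall back to the other on failure.
        default Parser<T> or(Parser<T> other) {
            return in -> {
                Optional<Result<T>> first = parse(in);
                return first.isPresent() ? first : other.parse(in);
            };
        }
    }

    // Always succeeds with the given value and consumes nothing.
    static <T> Parser<T> succeed(T value) {
        return in -> Optional.of(new Result<T>(value, in));
    }

    // Terminal symbol: matches the exact string at the start of the input.
    static Parser<String> literal(String s) {
        return in -> {
            if (!in.startsWith(s)) return Optional.empty();
            return Optional.of(new Result<String>(s, in.substring(s.length())));
        };
    }

    // number = digit { digit }, written by hand as a primitive parser.
    static final Parser<Integer> NUMBER = in -> {
        int i = 0;
        while (i < in.length() && Character.isDigit(in.charAt(i))) i++;
        if (i == 0) return Optional.empty();
        return Optional.of(new Result<Integer>(Integer.parseInt(in.substring(0, i)), in.substring(i)));
    };

    // sum = number [ "+" sum ], evaluated to an integer while parsing.
    static Parser<Integer> sum() {
        return NUMBER.flatMap(n ->
                literal("+")
                    .flatMap(plus -> sum())   // recursion happens lazily, at parse time
                    .map(m -> n + m)
                    .or(succeed(n)));
    }

    public static void main(String[] args) {
        System.out.println(sum().parse("1+20+3").map(r -> r.value).orElse(-1)); // prints 24
    }
}
```

The body of `sum()` mirrors the grammar rule almost token for token, and everything is built from closures and higher order functions. Scala's syntax makes the same composition read far more cleanly.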

Higher Order Functions - The Secret Sauce ?

There have been lots of debates on whether object-oriented interfaces scale better than syntax extension capabilities in language design. While OO certainly has its place in modularizing components and abstracting away the relationships between them, there are situations where objects force us to fit a square peg in a round hole. How many times have you cursed Java for forcing you to define an unnecessary interface just to apply a function over a set of abstractions that share a specific set of contracts? You can do the same in Scala using structural typing (aka anonymous types) and higher order functions. Higher order functions seem to be the secret sauce for offering syntax extensibility in programming languages.
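Java has no structural typing, but Java 8 and later (again, newer than this post) can approximate the higher-order half of the argument: pass the behaviour in as a function instead of forcing every class to implement a shared interface. A minimal sketch, with `Duck`, `Robot` and `speakAll` as hypothetical names of my own:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class HigherOrderSketch {
    // Two unrelated types that share a method shape but no common interface.
    static class Duck  { String speak() { return "Quack"; } }
    static class Robot { String speak() { return "Beep"; } }

    // The caller supplies the behaviour as a function, so no `Speaker` interface is needed.
    static <T> List<String> speakAll(List<T> items, Function<T, String> speak) {
        List<String> out = new ArrayList<>();
        for (T item : items) {
            out.add(speak.apply(item));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(speakAll(Arrays.asList(new Duck(), new Duck()), Duck::speak));
        System.out.println(speakAll(Arrays.asList(new Robot()), Robot::speak));
    }
}
```

Each list still has to be homogeneous here; Scala's structural types go further and let a single method accept anything that has a matching `speak` member.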


Anonymous said...

It's all around
"Syntactic sugars, when properly designed, reduce the semantic distance between the problem domain and the solution domain."

The other can be easily excluded.

Anonymous said...

I tend to agree that most of the time, the ability to add syntactic sugar is useful. In order of power, we've got things like C's macros, functional constructs that look like syntax (like Scala), syntax so flexible that it obscures the boundary between idiom and DSL (Ruby), and at the top you've got true macro systems like Lisp.

But although this is OFTEN helpful, there is also a dark side to adding syntactic sugar. It is similar to operator overloading in C++: useful for creating complex numbers or smart pointers, but I've seen SQL libraries that abused it to the point of incomprehensibility. The danger of adding new syntactic constructions is that they are familiar only to someone who has seen them before -- typically someone already familiar with that body of code. Used sparingly, they can make code more readable, but overused they can make code readable *only* to someone with deep background knowledge. And readability is one of the most important properties.

So I like introducing new syntactic constructs, but I always think twice before doing so, and only take the plunge if I think it would be more readable EVEN to a reviewer who had to interrupt their reading to go look up the construct. Compressing 30 lines to 1 is a clear win, especially if repeated in many places; compressing 2 lines to 1 is unlikely to be worth the cognitive load.