Thursday, January 18, 2007

Syntax Extensibility, Ruby Metaprogramming and Lisp Macros

Over the last few days I have been feeling a bit Lispy. It's not that I have been immersed in Lisp programming. I still do Java for my day job and enjoy the process of staring at reams of standard object-oriented API calls and the gigantic frameworks that provide the glue code for enterprise software. Java is still my favorite programming language; I still enjoy writing it, and I have recently taken on bigger commitments to write more Java with more Spring and more Hibernate.

The only difference is that I have started reading Paul Graham's On Lisp again!

I am convinced that I will not be programming production-level business applications in Lisp in the foreseeable future. But reading Lisp makes me think differently. The moment I start writing event listeners in Java Swing, I start missing true lexical closures and wishing for higher-order functions in the language. Boilerplate irritates me much more and makes me imagine how I could have modeled it better using Scheme macros. True, I have been using the best IDE and leaving it to its code generation engine to generate all the boilerplate, and I have also put in place a bit of MDA within my development environment that generates much of the code from the model. I am a big fan of AOP and have been using aspects for quite some time to modularize my designs and generate write-behind logic through the magic of bytecode weaving.

The difference, once again, is that I have been exposed to the best code generator of all time: one with a simple, uniform syntax and access to the whole language parser, which gets a piece of source code as a single uniform data structure and knows how to churn out the desired transformation in a fail-safe manner, day in and day out - the Lisp macro.

Abstractions - Object Orientation versus Syntax Construction

As someone obsessed with the OO paradigm, thriving on the backbone of objects, virtual functions and polymorphism, I have learnt to model abstractions in terms of objects and classes (the kingdom of nouns). I define classes on top of the Java language infrastructure, add data members as attributes, add behavior through methods that operate on those attributes, and, whenever need be, invoke the methods on an instance of the class. This is the way I have, so far, learnt to add abstraction to an application layer. Abstraction, as they say, is an artifact of the solution domain, which should ultimately bring you closer to the problem domain. We have:

Machine Language -> High Level language -> Abstractions in the Solution Domain -> Problem Domain

In the case of object-oriented languages like Java, the language itself is monstrous in size; add to that at least a couple of gigantic frameworks, and abstractions are clearly guests on top of the language layer. Lisp, in its original incarnation, was conceived as a language with very little syntax. It was designed as a programmable programming language, and developing abstractions in Lisp enriches not only the third block above, but a significant part of the second block as well. I now get what Paul Graham means by bottom-up programming, the extensible language, building the language up toward your program.

Take this example:

I want to implement dolist(), which performs an operation on each member of a list. With a Lisp implementation, we can have a natural extension of the language through a macro:


(dolist (x '(1 2 3)) (print x) (if (evenp x) (return)))


and the moment we define the macro, it blends into the language syntax like a charm. This is abstraction through syntax construction.

And the Java counterpart would be something like:


// ..
Collection<..> list = ... ;
CollectionUtils.dolist(list,
    new Predicate() {
      // receives each element of the list in turn
      public boolean evaluate(Object x) {
        // ..
      }
    });
// ..



which provides an object oriented abstraction of the same functionality. This solution provides the necessary abstraction, but is definitely not as seamless an extension of the language as its Lisp counterpart.
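For comparison, a language with blocks (such as Ruby, discussed below) sits somewhere between the two. The following is only a rough sketch: the `dolist` helper is a hypothetical name chosen to mirror the Lisp macro, not something Ruby ships with, and it is an ordinary method taking a block rather than a syntax extension.

```ruby
# Hypothetical helper that mirrors the Lisp dolist; it simply yields
# each element of the list to the caller's block.
def dolist(list)
  list.each { |item| yield item }
end

seen = []
dolist([1, 2, 3]) do |x|
  seen << x
  break if x.even?  # leaves the loop early, like (return) in the Lisp version
end
```

The call site reads almost like a built-in loop, without the anonymous class wrapper that the Java version needs.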

Extending Extensibility with Metaprogramming

Metaprogramming is the art of writing programs which write programs. Languages which offer syntax extensibility provide the normal path to metaprogramming. And Java is a complete zero in this regard. C offers more trouble to programmers through its wacky macros, while C++'s template metaprogramming facilities are no less hazardous than pure black magic.

Ruby offers excellent metaprogramming facilities through its eval() family of methods, here-docs, open classes, blocks and procs. Ruby is a language with very clean syntax, having the natural elegance of Lisp and extremely powerful metaprogramming facilities. Ruby's metaprogramming capabilities have given a new dimension to the concept of API design in applications. Have a look at this example from a sample Rails application:


class Product < ActiveRecord::Base
  validates_presence_of :title, :description, :image_url
  validates_format_of :image_url,
    :with => %r{^http:.+\.(gif|jpg|png)$}i,
    :message => "must be a URL for a GIF, JPG, or PNG image"
end

class LineItem < ActiveRecord::Base
  belongs_to :product
end



It's a really cool DSL made possible through the syntax extension capabilities offered by Ruby. It's not much of OO that Rails exploits to offer great APIs; instead, it's the ability of Ruby to define new syntactic constructs through first-class symbols that adds to the joy of programming.
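To make the mechanism a little more concrete, here is a minimal sketch of how such a class-level declaration could be built. `TinyModel` and `missing_attributes` are hypothetical names invented for this illustration; this is not how ActiveRecord implements its validations.

```ruby
# Hypothetical stand-in for a framework base class; not ActiveRecord's code.
class TinyModel
  # Class-level "declaration": records which attributes must be present
  # and creates plain accessors for them.
  def self.validates_presence_of(*attrs)
    required_attributes.concat(attrs)
    attr_accessor(*attrs)
  end

  # Per-class list of required attribute names.
  def self.required_attributes
    @required_attributes ||= []
  end

  # Returns the names of required attributes that are nil or empty.
  def missing_attributes
    self.class.required_attributes.select do |name|
      value = send(name)
      value.nil? || value.to_s.empty?
    end
  end
end

class Product < TinyModel
  validates_presence_of :title, :description
end

product = Product.new
product.title = "Agile Web Development"
missing = product.missing_attributes  # :description has not been set
```

The declaration inside `Product` is just a method call on the class being defined; the symbols and optional parentheses are what make it read like new syntax.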

How would the above LineItem definition look in Lisp's database bindings? Let's take this hypothetical model:


(defmodel <line_item> ()
  (belongs_to <product>))



The difference from the above Rails definition is the use of macros in the Lisp version as opposed to class functions in Rails. In the Rails definition, belongs_to is a class function which, when called, defines a bunch of member functions in the class LineItem. Note that this is a commonly used idiom in Ruby metaprogramming, where we can define methods in the derived class right from the base class. But the main point here is that in the Lisp version, the macros are expanded in the macro expansion phase, before the program runs, and hence avoid the runtime cost that the Rails counterpart pays when the class definition is evaluated.
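The idiom of a base class defining methods in its subclasses can be sketched in a few lines. `TinyRecord` below is a hypothetical base class for illustration only; the real belongs_to in Rails does far more (foreign keys, loading, caching).

```ruby
# Hypothetical base class, for illustration only.
class TinyRecord
  def self.belongs_to(name)
    # Runs while the subclass body is being evaluated, i.e. at runtime,
    # and adds a reader and a writer for the association to that subclass.
    define_method(name) { instance_variable_get("@#{name}") }
    define_method("#{name}=") { |value| instance_variable_set("@#{name}", value) }
  end
end

class LineItem < TinyRecord
  belongs_to :product
end

item = LineItem.new
item.product = "a product"  # any object stands in for a Product here
generated = LineItem.instance_methods(false)
```

Here `generated` contains the two methods that the belongs_to call added to LineItem, although LineItem itself never spelled them out.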

Another great Lispy plus ..

Have a look at the following metaprogramming snippet in Ruby, incarnated using class_eval for generating the accessors in a sample bean class :


def self.property(*properties)
  properties.each do |prop|
    class_eval <<-EOS
      def #{prop} ()
        @#{prop}
      end
      def #{prop}= (val)
        @#{prop} = val
      end
    EOS
  end
end



Here the code which the metaprogram generates is embedded within a Ruby here-doc as a string - eval-ing a string is not a recommended practice in the Ruby world. This stringy code is not treated as a first-class citizen, in the sense that IDEs do not respect it as a code snippet and neither do debuggers. This has been described in his usual style and detail by Steve Yegge in this phenomenal blog post. Using define_method makes it IDE friendlier, but at the expense of readability and speed. The wacky class_eval runs much faster than the define_method version. A rough benchmark indicated that the class_eval version ran twice as fast on Ruby 1.8.5 as the one using define_method.


def self.property(*properties)
  properties.each do |prop|
    define_method(prop) {
      instance_variable_get("@#{prop}")
    }

    define_method("#{prop}=") do |value|
      instance_variable_set("@#{prop}", value)
    end
  end
end
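A comparison of the two variants could be set up roughly as follows. This is only a sketch: the `Bean` class, the method names `property_eval` and `property_define`, and the iteration count are assumptions made for the illustration, and the timings will vary with the Ruby version and the machine.

```ruby
require "benchmark"

class Bean
  # String-based variant: builds the method source as text and evaluates it.
  def self.property_eval(*properties)
    properties.each do |prop|
      class_eval <<-EOS
        def #{prop}
          @#{prop}
        end
        def #{prop}=(val)
          @#{prop} = val
        end
      EOS
    end
  end

  # Block-based variant: each generated method body is a closure.
  def self.property_define(*properties)
    properties.each do |prop|
      define_method(prop) { instance_variable_get("@#{prop}") }
      define_method("#{prop}=") { |value| instance_variable_set("@#{prop}", value) }
    end
  end

  property_eval :name
  property_define :title
end

bean = Bean.new
bean.name = "eval"
bean.title = "define"

n = 200_000  # arbitrary iteration count
eval_time = Benchmark.realtime { n.times { bean.name } }
define_time = Benchmark.realtime { n.times { bean.title } }
puts format("class_eval: %.4fs  define_method: %.4fs", eval_time, define_time)
```

Both variants produce accessors that behave the same from the caller's side; only the way the methods are created, and hence how they are invoked, differs.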



Anyway, all these are examples of dynamic metaprogramming in Ruby, since everything gets done at runtime. This is a big difference from Lisp, where the code templates are not typeless strings. They are valid Lisp data structures, which the macro processor can handle like normal Lisp code, since macros in Lisp operate on the parse tree of the program. Thus code templates in Lisp are IDE friendly, debugger friendly and real first-class code snippets. Many people have expressed their wish to have Lisp macros in Ruby - Ola Bini has some proposals on that as well. From whatever little Lisp I have been through, Lisp macros are really cool and a definite step forward towards providing succinct extensibility to the language through user-defined syntactic control structures.

OO Abstractions or Syntax Extensions ?

Coming from an OO-soaked background, I can only think in terms of OO abstractions. Ruby is possibly the first language that has pointed me to situations where syntax extensions scale better than OO abstractions - Rails is a live, killer example of this paradigm. And finally, when I tried to explore the roots, Lisp macros really floored me with their succinctness and power. I do not have the courage to say that the functional abstractions of Lisp and Ruby are more powerful than OO abstractions. Steve Yegge has put the natural inhibition of OO programmers towards extended syntactic constructs so subtly:

Lots of programmers, maybe even most of them, are so irrationally afraid of new syntax that they'd rather leaf through hundreds of pages of similar-looking object-oriented calls than accept one new syntactic construct.


My personal take is to exploit all the features a language has to offer. With a language like Ruby or Scala or Lisp, syntax extensibility is the natural model, while Java offers powerful OO abstractions - look at the natural difference of paradigms in modeling a Ruby on Rails application and a Spring-Hibernate application. This is one of the great eye-openers that the new dynamic languages have brought to OO programmers - beautiful abstractions are no longer a monopoly of OO languages. Lisp tried to force this realization long ago, but possibly the world was not ready for it.

7 comments:

Anonymous said...

Pretty nice article; thanks.

Syntax abstractions are far and away more powerful than any OOP abstractions because they help create an environment that looks like the problem you're trying to solve.

No matter what you do with Java you are still forced to write abominations like the Predicate example, which will always detract from code clarity.

Hopefully the syntax for Java 7 will not make the cure worse than the disease, but we shall see.

I hope that JRuby moves into the mainstream and we can get (nearly) the best of both worlds (I still *really* like macros :)).

Unknown said...

I don't think we should expect too much on the syntax front from Java 7. They are trying to sneak in closures, but I doubt they will make them usable in a pleasing way, since implementing them effectively would mean a loss of backwards compatibility, which is the last thing the Java guys want. Even among the Java 5 features, the smart loop isn't that smart, as Steve Yegge has pointed out in one of his blog posts. I think it is true that Java needs an overhaul to get rid of the legacy syntax. I had blogged about it some time ago.

All said and done, I still do Java for a living, since it has no substitute when it comes to enterprise scalability. And I agree that JRuby moving into the mainstream can be the best that can happen for the JVM. Using Ruby elegance with Java collections and tonnes of libraries .. I think this is the combination to look for ..

Anonymous said...

For me, Scala is the best compromise between expressiveness, performance, and real-world pragmatic potential. If performance doesn't matter, then Python/Ruby.

Unknown said...

I am also a big admirer of Scala. I have blogged extensively on some of the various features of Scala which I liked. See here, here, here and here.

Austin said...

Thanks for the interesting post.

I would note that when you switch from the Lisp "dolist" macro over to the Java Predicate example, you say "[Java counterpart of] an object oriented abstraction of the same functionality". Then you move onto metaprogramming.

The Java solution isn't inherently OO and you don't have to switch to code generation.

Another pure OO solution would be to make the object responsible for understanding a dolist message. This is done at design time in the IDE or running image. True, Java cannot support this level of flexibility, but it isn't terribly OO, just C++ done right :)

Ruby does have better support for late-binding and dynamic class definition. It can also provide a new OO iteration construct without switching to meta-programming.

Anonymous said...

checkout http://xlr.sourceforge.net/

what do u think?