Ruminations of a Programmer: April 2007

Monday, April 23, 2007

Executable XML (aka Lisp)

In a project that I have been working on for quite some time, the back office system receives XML messages from the front and middle office systems for processing. It is a securities trading and settlement system for one of the big financial houses of the world - typical messages are trades, settlements, position etc. which reach the back-office after the trade is made. Like any sane architect we have designed the system based on the Java EE stack (nowadays you never get fired for choosing Java ..) centered around a message oriented middleware transporting XML messages with gay abandon. The system has gone live for many implementations and has been delivering satisfactory throughput all over.

No complaints whatsoever, on the architecture, on the Java EE backbone, on the multitudes of XML machinery that goes behind the engineering harness of the millions of messages generated every day. If I were to architect the same system today (the existing one had been architected 3 years back), I, possibly would have gone for a similar stack, just for the sheer stability and robustness of the XML based technology and the plethora of toolset that XML offers today.

Why am I writing this blog then ?

Possibly I have been having extra caffeine of late, which has been taking away most of my sleep at night. It is 1 AM in the morning and I am still glued to two of my newest possessions in my bookshelf.

I had read some parts of SICP long back - rereading it now is like a rediscovery of many of the Aha! moments that I had last time and of course, lots of ruminations and discoveries this time as well. Based on the new found lights of Lispy (and s-expy) knowledge, I hope that some day I will be able to infuse my today's dreams and rumblings into a real life architecture. I am not sure if we will ever reach the stage of human evolution when Lisp and Scheme will be considered the bricks and mortar of enterprise architecture. Till then they will exist as the sexy counterparts of Java and C++, and will continue to allure all developers who have once committed the sin of being there and done that.

The XML Message - Today's Brick and Mortar

Here is how a sample trade message (simplified for clarity) looks in our system :

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE trd01 SYSTEM "trd01.dtd">
<trd01>
  <id>10234</id>
  <trade_type>equity</trade_type>
  <trade_date>2005-02-21T18:57:39</trade_date>
  <instrument>IBM</instrument>
  <value>10000</value>
  <trade_ccy>usd</trade_ccy>
  <settlement_info>
    <settle_date>2005-02-21T18:57:39</settle_date>
    <settle_ccy>usd</settle_ccy>
  </settlement_info>
</trd01>

We use XML parsers to parse this message, use all sorts of XPath expressions, XQuery and XSLT transformations to do all processing, querying and tearing apart the hierarchical structures that embody an XML message. The above XML message is *data* and we have tonnes of Java code processing the XML data, transforming them into business logic and persisting them in the database. So, we have the *code* (in Java) strictly separated from the *data* (in XML) using a small toolbox comprising of a bundle of XML parsers, XPath expressions, XSLT transformations, all packaged in a couple of dozens of third party jars (aka frameworks). The entire exercise is meant to make these *data* executable.

Executable Data - aka Lisp

Why not use a power that allows you to execute your data directly instead of creating unnecessary abstraction barriers in the name of OOP ? Steve Yeggey summarises it with elan :

The whole nasty "configuration" problem becomes incredibly more convenient in the Lisp world. No more stanza files, apache-config, .properties files, XML configuration files, Makefiles — all those lame, crappy, half-language creatures that you wish were executable, or at least loaded directly into your program without specialized processing. I know, I know — everyone raves about the power of separating your code and your data. That's because they're using languages that simply can't do a good job of representing data as code. But it's what you really want, or all the creepy half-languages wouldn't all evolve towards being Turing-complete, would they?

In Lisp, I can make the above data much more readable, clearer in intent, easier for the eyes and at the same time make it executable ..

(trd01
  (id 10234)
  (trade_type "equity")
  (trade_date "2005-02-21T18:57:39")
  (instrument "IBM")
  (value 10000)
  (trade_ccy "usd")
  (settle_info
    (settle_date "2005-02-24T18:57:39")
    (settle_ccy "usd"))))

Lisp is a language which was intended to be small with *no* syntax, where you have the power of macros to create your own syntax and roll it out into the language. I made each of the above tags separate Scheme functions (Oh! I was using Scheme btw), each one of which is capable of transforming itself into the desired functionality. As a result, the above data is also my code and directly executes. Another of those Aha! moments.

But my entire system is based on Java! Surely you are not telling me to change all of the guts to Scheme - are you ? My job will be at stake and I will never be able to convince my pointy haired boss that the trading back office system is running on Lisp. In fact, many people who dare to use Lisp in daytime projects are often careful to keep this a secret in the industry. And unless you have PG on your company board or blessed enough to get the favor of Y, this is a very useful tip.

Enter SISC

SISC is a lightweight, platform independent Scheme system targetting the Java Virtual Machine. It comes as a lightweight distribution (the core jar is 233 KB) and offers Scheme as a scripting language for Java. In SISC bridging is accomplished by a Java API for executing Scheme code and evaluating Scheme expressions, and a module that provides Scheme-level access to Java objects and implementation of Java interfaces in Scheme.

I can write Scheme modules and load it using Java api from my Java code, once I have bootstrapped the SISC runtime. Here is how I can initialize SISC from my Java application so as to enable my application use the Scheme functions :

// bootstrapping the SISC runtime
SeekableInputStream heap = new MemoryRandomAccessInputStream(
    getClass().getResourceAsStream("/sisc.shp"));
AppContext ctx = new AppContext();
ctx.addHeap(heap);
Interpreter interpreter = Context.enter(ctx);

and then I can use on the fly evaluation of Scheme functions as follows :

interpreter.eval("(load \"trd01.scm\")");
String s = interpreter.eval("(trd01 (id 10234) (trade_type "equity") ...)").toString();

There are quite a few variants of eval() that SISC offers, along with multiple modes of execution of Scheme code from the Java environment. For details have a look at their documentation. SISC also offers calling Java code from Scheme accessing Java classes through the extensible type system of SISC.

I do not dream about using SISC in any production code in the near foreseeable future. But just wanted to share my rants with all of you. In today's world, it is really raining programming languages - all scripting languages like Ruby, Python, JRuby, Groovy etc. are making inroads as the preferred glue language of today's enterprise architecture. But Lisp stands out as a solid robust language, with the exceptionally powerful code-as-data paradigm - I always felt Lisp was way ahead of its time. Possibly this is the time when Lisp needs to be reincarnated, all incompatibilities should be rubbed off the numerous versions and dialects of Lisp. Lisp code is executable data - it makes perfect sense to replace all frameworks that execute reams of code to process hierarchical structures as data by a single language.

Monday, April 16, 2007

Competition is healthy! Prototype Spring Bean Creation: Faster than ever before ..

In my last post I had mentioned about some performance benchmarks of Guice and Spring. In one of the applications which I had ported from Spring to Guice, I had an instance of a lookup-method injection, where a singleton service bean contained a prototype bean that needed to be looked up from the context. Here is the sample configuration XML :

<bean id="trade"
  class="org.dg.misc.Trade"
  scope="prototype">
</bean>

<bean id="abstractTradingService"
  class="org.dg.misc.AbstractTradingService"
  lazy-init="true">
  <lookup-method name="getTrade" bean="trade"/>
</bean>

I ran a performance benchmark suite to exercise 10,000 gets of the prototype bean :

BeanFactory factory = new XmlBeanFactory(
    new ClassPathResource("trade_context.xml"));
  
ITradingService ts = (ITradingService) factory.getBean(
    "abstractTradingService");
StopWatch stopWatch = new StopWatch();
stopWatch.start("lookupDemo");
    
for (int x = 0; x < 10000; x++) {
  ITrade trade = ts.getTrade();
  trade.calculateValue(null, null);
}
stopWatch.stop();
    
System.out.println("10000 gets took " + 
    stopWatch.getTotalTimeMillis() + " ms");

Spring 2.0.2 reported a timing of 359 milliseconds for the 10,000 gets. I performed the same exercise in Guice with a similar configuration :

public class TradeModule extends AbstractModule {
  @Override
  protected void configure() {
    bind(ITrade.class).to(Trade.class);
    bind(ITradingService.class).to(TradingService.class).in(Scopes.SINGLETON);
  }
}

and the corresponding test harness :

Injector injector = Guice.createInjector(new TradeModule());
  
long start = System.currentTimeMillis();
for(int i=0; i < 10000; ++i) {
  injector.getInstance(ITradingService.class).doTrade(injector.getInstance(ITrade.class));
}
long stop = System.currentTimeMillis();
System.out.println("10000 gets took " + (stop - start) + " ms");

For this exercise of 10,000 gets, Guice reported a timing of staggering 31 milliseconds.

Then a couple of days back Juergen Hoeller posted in the release news for Spring 2.0.4, that repeated creation of prototype bean instances has improved up to 12 times faster in this release. I decided to run the benchmark once again after a drop-in replacement of 2.0.2 jars by 2.0.4 ones. And voila ! Indeed there is a significant improvement in the figures. The same test harness now takes 109 milliseconds on the 2.0.4 jars. Looking at the changelog, you will notice several lineitems that have been addressed as part of improving bean instantiation timings.

This is what competition does even for the best .. Keep it up Spring guys ! Spring rocks !

Updated: Have a look at the comments by Bob and the followups for some more staggering benchmark results.

Monday, April 09, 2007

Guiced! Experience Porting a Spring Application to Guice

Yes, I got one of my Spring-Hibernate-JPA applications ported to use Guice for Dependency Injection. The application is a medium sized one and did not contain many of the corner features which have been discussed aggressively amongst the blogebrities. But now that I have the satisfaction of porting one complete application to Guice, I must say there are truly *lots of* goodies that this crazy bob creation has in it. Some of them I had mentioned in my earlier rants on Guice, many of them were hiccups, which came up primarily because I had only been a newbie with Guice - many of my questions and confusions were clarified by the experts, in my blog comments as well as in the Guice developer's mailing list. Guice is definitely an offering to look out for in the space of IoC containers. I liked what I saw in it .. in this post I will share some of my experiences.

Disclaimer : I will only focus on issues that I faced and solved in course of my porting of the application. The application did not have many blocker features for Guice - hence I do not claim that *all* applications can be ported completely using Guice 1.0.

Really Guicy!

Before I go into the details, here are some of the guicy attributes of Guice as a Java framework ..

The Java 5 usage - I always believe that backward compatibility is not a do-all end-all in a framework evolution. At some stage u need to educate the users as well, to migrate to newer versions and use the advanced features of your framework, which will make their applications more performant and maintainable. This has been one of my complaints against Java as well. It is really heartening to see Guice designers base their engine on Java 5 and use all advanced features like metadata and generics to the fullest. This has definitely made Guice more concise, precise and DRY.

Type-safety - This is possibly the loudest slogan of Guice as a DI container. It's Java generics all the way and although you can subvert the typesystem (more on this later) and hide some of your bindings from Guice, it is more of an exception. All api s in Guice are strongly typed, hence your application remains blessed with the safety of typed injections that Guice encourages.

Concise, minimal, well-designed api set with extremely verbose and explanatory error messages.

Let's Guice it up ..

Here it is. The application has been running happily in a Spring-Hibernate-JPA architecture. I took up the porting exercise purely out of academic interest and to get a first hand feel of trying to validate Guice against a real life non trivial application. I was somewhat aware of the nuances that I needed to figure out beforehand, and I classified my injection points into the following three groups :

points that I had complete control of and where I knew I would be able to inject my annotations

services with multiple implementations being used in the same application - luckily I did not have many such instances

third party POJOs that I could not invade into

The first ones were pretty cool and I happily added @Inject with appropriate bindings in the module. The injection points became very explicit and the class became more readable as far as external dependencies were concerned.

I did not have many occurences of multiple implementations of the same service being used in the same application deployment. As far as the application is concerned, we needed different implementations for different deployments, and hence I had different modules in place for them. In one of the cases, I needed to address the problem within the same instance of the application, which I did the usual way, using annotations like the following :

bind(IBarService.class)
  .to(BarService.class)
  .in(Scopes.SINGLETON);
  
bind(IBarService.class)
  .annotatedWith(Gold.class)
  .to(SpecialBarService.class)
  .in(Scopes.SINGLETON);

Injecting into third party POJOs is one of the issues that has been debated over a lot in the various blogs and forums. Here are some of the cases and how I addressed them in my application :

Case 1: Use Provider<> along with constructor injection : I used this pattern to address POJOs which we were using as part of another component and which used constructor injection. e.g.

public class ThirdPartyBeanProvider implements Provider<ThirdPartyBean> {
 
  final private IFooService fooService;
  final private IBarService barService;
 
  @Inject
  public ThirdPartyBeanProvider(final IFooService fooService, final IBarService barService) {
    this.fooService = fooService;
    this.barService = barService;
  }

  public ThirdPartyBean get() {
    return new ThirdPartyBean(fooService, barService);
  }
}

Case 2: These beans were using setter injection and I had some cases where multiple implementations of a service where being used in the same deployment of the application. Use Provider<> along with annotations to differentiate the multiple implementations of an interface. Luckily I did not have many of these cases, otherwise it would have been a bit troublesome with annotation explosion. But, at the same time, I think there may not be lots of use cases which need this in typical applications for a single deployment. e.g.

public class AnotherThirdPartyBeanProvider implements Provider<AnotherThirdPartyBean> {

  final private IFooService fooService;
  final private @Inject @Gold IBarService barService;
 
  @Inject
  public AnotherThirdPartyBeanProvider(final IFooService fooService, final IBarService barService) {
    this.fooService = fooService;
    this.barService = barService;
  }

  public AnotherThirdPartyBean get() {
    AnotherThirdPartyBean atb = new AnotherThirdPartyBean();
    atb.setFooService(fooService);
    atb.setBarService(barService);
    return atb;
  }
}

Case 3: Here I had lots of POJOs using setter injections that needed to be handled the same way. I would have to write lots of providers, but for this dynamic gem from Kevin which I dug up in a thread in the developer's mailing list. Here the type system is a bit subverted, and Guice does not have full information of all bindings. But, hey .. for porting applications, AutowiringProvider<> gave me a great way to solve this issue. Here's straight out of the class javadoc :

A provider which injects the instances it provides using an "auto-wiring" approach, rather than requiring {@link Inject @Inject} annotations. This provider requires a Class to be specified, which is the concrete type of the objects to be provided. It must be hand-instantiated by your {@link com.google.inject.Module}, or subclassed with an injectable constructor (often simply the default constructor).

And I used it like a charm to set up the bindings of my POJOs.

Finally, here is a snapshot of a representative Module class, with actual class names changed for demonstration purposes :

public class MyModule extends AbstractModule {

  @Override
  protected void configure() {
    bind(IFooService.class)
      .to(FooService.class)
      .in(Scopes.SINGLETON);
  
    bind(IBarService.class)
      .to(BarService.class)
      .in(Scopes.SINGLETON);
  
    bind(IBarService.class)
      .annotatedWith(Gold.class)
      .to(SpecialBarService.class)
      .in(Scopes.SINGLETON);
  
    bind(ThirdPartyBean.class)
      .toProvider(ThirdPartyBeanProvider.class);
  
    bind(YetAnotherThirdPartyBean.class)
      .toProvider(new AutowiringProvider<YetAnotherThirdPartyBean>(YetAnotherThirdPartyBean.class));
  
    bind(AnotherThirdPartyBean.class)
      .toProvider(AnotherThirdPartyBeanProvider.class);
  }
}

Injecting the EntityManager

In an implementation of the Repository pattern (a la Domain Driven Design), I was using JPA with Hibernate. I had an implementation of a JpaRepository, where I was injecting an EntityManager through the annotation @PersistenceContext. This was working with normal Java EE application servers where the container injects the appropriate instance of the EntityManager.

public class JpaRepository extends RepositoryImpl {
 
  @PersistenceContext
  private EntityManager em;
  // ..
  // ..
}

Spring also supports this annotation both at field and method level if a PersistenceAnnotationBeanPostProcessor is enabled. Using Guice I had to write a Provider<> to have this same functionality implemented in my Java SE application.

@Singleton
public class EntityManagerProvider implements Provider<EntityManager> {
 
  private static final EntityManagerFactory emf =
    Persistence.createEntityManagerFactory("GuiceJpaGettingStarted");
 
  public EntityManager get() {
    return emf.createEntityManager();
  }
}

The Guice Way

Guice is opinionated .. yes, it really is. And through this porting exercise I have learnt it. It encourages some practices and adds syntactic vinegars trying to subvert the recommendations. e.g. For injection, you either annotate with @Inject or write Providers. Provider<> is a wonderful tiny abstraction and it's amazing how powerful it can get in real life applications. Use Providers to implement custom instantiation policies, multiple injections per dependency and even can have custom scopes for injecting providers. For porting applications, you can use AutowiringProvider<>, but that's not really what Guice encourages a lot.

Guicy Performance

In the Spring based version, I had been using quite a few lookup-method-injections to design singleton services that have prototype beans injected. While porting, I didn't have to do anything special in Guice, apart from specifying the appropriate scopes during binding in modules. And these prototype beans were heavily instantiated within the application. I did some benchmarking and found that Guice proved to be much more performant than Spring for these use cases. In some cases I got 10 times better performance in injection and repeated instantiation of prototype beans within singleton services. I admit that Spring has much richer support for lifecycle methods and I would not venture into the hairy territory of trying to compare Spring with Guice. But if I would have to select an IoC container just for dependency injection, Guice will definitely be up there as a strong contender.