Ruminations of a Programmer: November 2006

Monday, November 27, 2006

Threadless Concurrency on the JVM - aka Scala Actors

Billy has written an interesting blog on the impact of multicore processors on Java. He concludes that the Java EE platform will have to be redressed to some extent in order to address the new threading patterns that applications will use and the consequences of reduced clock speed to accomodate the extra cores on the die. He has made some very thoughtful observations regarding the evolution of the future Java EE platforms and the JVM. Definitely worth a couple of reads ..

Concurrency Concurrency

One of the commonly mentioned fallouts of the new processor architectures is the new face of the applications written on the JVM. In order to take performance advantage from the multiple cores, applications need to be more concurrent, programmers need to find more parallelism within the application domain. Herb Sutter sums it up nicely in this landmark article :

But if you want your application to benefit from the continued exponential throughput advances in new processors, it will need to be a well-written concurrent (usually multithreaded) application. And that’s easier said than done, because not all problems are inherently parallelizable and because concurrent programming is hard.

Look Maa ! No Threads !

Writing multi-threaded code is hard, and, as the experts say, the best way to deal with multi-threading is to avoid it. The two dominant paradigms of concurrency available in modern day languages are :

Shared State with Monitors, where concurrency is achieved through multiple threads of execution synchronized using locks, barriers, latches etc.

Message Passing, which is a shared-nothing model using asynchronous messaging across lightweight processes or threads.

The second form of concurrent programming offers a higher level of abstraction where the user does not have to interact directly with the lower level primitives of thread models. Erlang supports this model of programming and has been used extensively in the telecommunications domain to achieve a great degree of parallelism. Java supports the first model, much to the horror of many experts of the domain and unless you are Brian Goetze or Doug Lea, designing concurrent applications in Java is hard.

Actors on the JVM

Actor based concurrency in Erlang is highly scalable and offers a coarser level of programing model to the developers. Have a look at this presentation by Joe Armstrong which illustrates how the share-nothing model, lightweight processes and asynchronous messaging support makes Erlang a truly Concurrency Oriented Programming Language. The presentation also gives us some interesting figures - an Erlang based Web server supported more than 80,000 sessions while Apache crashed at around 4,000.

The new kid on the block, Scala brings Erlang style actor based concurrency on the JVM. Developers can now design scalable concurrent applications on the JVM using the actor model of Scala which will automatically take advantage of the multicore processors, without programming to the complicated thread model of Java. In applications which demand large number of concurrent processes over a limited amount of memory, threads of the JVM, prove to be of significant footprint because of stack maintenance overhead and locking contentions. Scala actors provide an ideal model for programming in the non-cooperative virtual machine environment. Coupled with the pattern matching capabilities of the Scala language, we can have the full power of Erlang style concurrency on the JVM. The following example is from this recent paper by Philipp Haller and Martin Odersky:

class Counter extends Actor {
  override def run(): unit = loop(0)

  def loop(value: int): unit = {
    Console.println("Value: " + value)
    receive {
      case Incr() => loop(value + 1)
      case Value(a) => a ! value; loop(value)
      case Lock(a) => a ! value
        receive { case UnLock(v) => loop(v) }
      case _ => loop(value)
    }
  }
}

and its typical usage also from the same paper :

val counter = new Counter // create a counter actor
counter.start() // start the actor
counter ! Incr() // increment the value by sending the Incr() message
counter ! Value(this) // ask for the value

// and get it printed by waiting on receive
receive { case cvalue => Console.println(cvalue) }

Scala Actors

In Scala, actors come in two flavors -

Thread based actors, that offer a higher-level abstraction of threads, which replace error-prone shared memory accesses and locks by asynchronous message passing and

Event based actors, which are threadless and hence offer the enormous scalability that we get in Erlang based actors.

As the paper indicates, event based actors offer phenomenal scalability when benchmarked against thread based actors and thread based concurrency implementations in Java. The paper also demonstrates some of the cool features of library based design of concurrency abstractions in the sense that Scala contains no language support for concurrency beyond the standard thread model offered by the host environment.

I have been playing around with Scala for quite some time and have been thoroughly enjoying the innovations that the language offers. Actor based concurrency model is a definite addition to this list, more so since it promises to be a great feature that programmers would love to have as part of their toolbox while implementing on the JVM. JVM is where the future is, and event based actors in Scala will definitely be one of the things to watch out for ..

Monday, November 20, 2006

Use Development Aspects to Enforce Concurrency Idioms in Java Applications

Java 5 has given us a killer concurrency library in java.util.concurrent. Mustang will add more ammunitions to the already performant landscape - now it is upto the developers to use it effectively for the best yield. If you are not Doug Lea and have been getting your hands dirty with the Executors and Latches and Barriers of java.util.concurrent, I am sure you must have realized that killer libraries also need quality programmers to deliver the good. I have been thinking of ways get some of the concurrency goodies into existing Java applications, who have recently migrated to the Java 5 platform and have still been struggling with performance problems on multithreaded programs. Of course for production environments, we cannot redesign things from scratch, however ingenuous solution we promise to deliver. However, in one of these recently migrated applications, it was one of my charters to do a code review and suggest a path of least resistance that can potentially introduce some of the improvements of java.util.concurrent without any major redesign.

I decided to approach the problem by trying to find out some of the obvious concurrency related problems that Brian Goetze has been harping upon in his Java Theory and Practice columns in IBM developerworks. The codebase was huge and has evolved over the last 3 years under the auspices of a myriad of programmers - hence I decided to equip myself with the most important crosscutting weapon, a range of development aspects that can point me to some of the possible problem areas. There may be some false positives, but overall the strategy worked .. the following post highlights some of them with some representative code snippets for illustration purposes.

Safe Publication of Objects #1

Do not allow the this reference to escape during construction. One typical scenario where programmers make mistake is to start a thread from within a constructor. And as Brian Goetze has pointed out, it is a definite anti-pattern. I wrote a small aspect for detecting this :

public aspect TrackEscapeWithThreadStartInConstructor {
  pointcut inConstructor() : initialization(org.dg.biz.*.new(..));
  pointcut threadStart() : call(void Thread.start()); 
 
  before() : cflow(inConstructor()) && threadStart() {
    throw new RuntimeException(
      "possible escape of this through thread start in constructor");
  } 
}

and Eclipse was quick to point me to the offending classes :

Safe Publication of Objects #2

Another anti-pattern when the this reference escapes from the constructor is when the programmer calls a non-static instance method from within the constructor of a class. Here is a small aspect that catches this anti-pattern :

public aspect TrackEscapeWithMethodCallInConstructor {
  pointcut inConstructor(Object o) 
   : target(o) 
    && withincode(org.dg.biz.*.new(..));

  pointcut callInstanceMethod() 
   : call(!private !static * org.dg.biz.*.*(..)); 

  before(Object o) : inConstructor(o) 
    && callInstanceMethod() 
    && if (o.equals(thisJoinPoint.getTarget())) {
 
      throw new RuntimeException(
      "possible escape of this through instance method call in constructor");
  }
}

and Eclipse responds :

Using the New Concurrent Collections

Java 5 and Java 6 offer concurrent collection classes as definite improvements over the synchronized collections and can be used as drop in replacements in most of the cases. The older synchronized collection classes serialize all access to the collection, resulting in poor concurrency. The new ones (ConcurrentHashMap, CopyOnWriteArrayList etc.) offer better concurrency through locking at a finer level of granularity. If the traversal in the concurrent collections is the dominant operation, then these new collection classes can offer dramatic scalability improvements with little risk - for more information refer to Brian Goetze's excellent book on Java Concurrency in Practice.

I wrote the following aspect which pointed me to all possible uses of the synchronized collections in the codebase :

public aspect TrackSynchronizedCollection {
  pointcut usingSyncCollection() : 
   call(public * java.util.Collections.synchronized*(..));
 
  declare warning 
    : usingSyncCollection()
    : "consider replacing with Concurrent collections of Java 5"; 
}

We did a careful review and replaced many of those occurences with the newer collections - and we were able to achieve significant performance gains in some situations.

Handling the InterruptedException

In many places within the application that deal with blocking apis and multithreading, I found empty catch blocks as handlers of InterruptedException. This is not recommended as it deprives code higher up on the call stack of the opportunity to act on the interruption - once again refer to Brian Goetze for details.

However, a small aspect allows to get to these offending points :

public aspect TrackEmptyInterruptedExceptionHandler {
  pointcut inInterruptedExceptionHandler() 
   : handler(InterruptedException+);

  declare warning
    : inInterruptedExceptionHandler()
    : "InterruptedException handler policy may not be defined";
 
  before() : inInterruptedExceptionHandler() {
    Thread.currentThread().interrupt();
  }
}

Development aspects are a great tool for refactoring and code review. I realized this first hand in the exercise that I did above and succeeded in identifying some of the anti-patterns of writing multi-threaded programs in Java and enforcing some of the common idioms and best practices across the codebase. In the above aspects, I have only scratched the surface - in fact developing a library of development aspects will be a great tool to the developer at large.

Monday, November 13, 2006

RIA, Echo2 and Programming Model

We, at Anshinsoft, have been working on our offering in the Reporting Solutions space - a WYSIWYG Report Designer and a full blown Enterprise Report Server. The Designer is a desktop Swing based application for designing reports, which, then can be deployed, managed and scheduled in the Enterprise Server.

As an additional plug-in, we would also like to have the Designer on the Web using the in-fashion RIA and Ajax architectural styles, which will enable users the usual flexibilities of a thin client application along with the richness that adds on to it. I have been exploring some of the architectural options towards this end, keeping in mind some of the constraints that we, as an organization have :

The current Designer has been implemented in Swing - we have a large programmer base who have delivered Java Swing based UIs and are familiar with the Swing programming model.

We do not have many Javascript programmers and as an organization are not equipped well enough to take up the seemingly daunting task of crafting a Javascript based architecture (aka client side Ajax)

I am comfortable with the first part of the previous point that we have a large Java programmer base. But based on the tonnes of Swing code that have been churned out in the current implementation, I am very much skeptical about the maintenability and the aesthetics of the codebase. In one of my previous posts, I had expressed my disliking of the Swing based programming model, which encourages lots of boilerplate stuff. Hence, I would like to move away from the naked model of Swing programming.

Enter Echo2

Based on the above constraints, I did some research and have ultimately arrived at Echo2 as the suggested platform for implementing the Web based Designer with a rich UI. The main pluses that I considered going with Echo2 are :

Completely Java based programming model, which nicely fits into the scheme of our organization expertise fabric.

Swing like apis which again score with respect to the familiarity metrics of the team.

Echo2 nicely integrates with Spring, which we use as the business layer.

I also considered GWT, but ultimately took on Echo2, because GWT is still in beta and does not have the same richness with respect to pre-built set of available components.

Concerns

The main concern that I have with Echo2 is, once again, related to the programming model - I am not a big admirer of the naked Swing model of programming. And here is the main gotcha .. I have been thinking of the following possibilities that can give me a more improved programming model on Echo2 :

Think MDA, use Eclipse EMF and openArchitectureWare to design models of the user interfaces and generate Echo2 code out of it. Then I maintain the models, which look much more pragmatic than maintaining a huge boilerplate style of codebase.

Has someone written something similar to Groovy SwingBuilder for Echo2, which I can use as a markup.

Use some sort of Language Oriented Programming, maybe a small DSL using the JetBrains Meta Programming System(MPS).

Write a homegrown abstraction layer on top of Echo2 that incorporates FluentInterfaces like goodies and offers a better programming model.

I would really like to go for a more declarative programming model - in Echo2, the navigation and flow logic are completely embedded with the rendering part. Can I externalize it without writing lots of framework code ?

Why not WebOnSwing or similar stuff ?

I do not want to deploy the existing Swing application - it has evolved over the years and it is time we move on to a higher level of abstraction and capitalize on richer features and responsiveness of Ajax frameworks.

I would like to seek suggestions from experts on the above views. Any pointers, any suggestions that will help us make a positive move will be most welcome!

Monday, November 06, 2006

Domain Abstractions : Abstract Classes or Aspect-Powered Interfaces ?

In my last post I had discussed about why we should be using abstract classes instead of pure Java interfaces to model a behavior-rich domain abstraction. My logic was that rich domain abstractions have to honor various constraints, which cannot be expressed explicitly by pure interfaces in Java. Hence leaving all constraints to implementers may lead to replication of the same logic across multiple implementations - a clear violation of the DRY principle.

However, lots of people expressed their opinions through comments in my blog and an interesting discussion on the Javalobby forum, where I found many of them to be big proponents of *pure* Java interfaces. All of them view Java interfaces as contracts of the abstraction and would like to undertake the pain of extreme refactoring in order to accomodate changes in the *published* interfaces. This post takes a relook at the entire view from the world of interfaces and achieve the (almost) same effect as the one described using abstract classes.

Interfaces Alone Don't Make the Cut!

Since pure Java interfaces cannot express any behavior, it is not possible to express any constraint using interfaces alone. Enter aspects .. we can express the same behavioral constraints using aspects along with interfaces.

Continuing with the same example from the previous post :-

The interface ..

interface IValueDateCalculator {
  Date calculateValueDate(final Date tradeDate) 
      throws InvalidValueDateException;
}

and a default implementation ..

public class ValueDateCalculator implements IValueDateCalculator {
  public Date calculateValueDate(Date tradeDate)
      throws InvalidValueDateException {
    // implementation
  }
}

Use aspects to power up the contract with mandatory behavior and constraints ..

public aspect ValueDateContract {

  pointcut withinValueDateCalculator(Date tradeDate) : 
      target(IValueDateCalculator) &&
      args(tradeDate) &&
      call(Date calculateValueDate(Date));
 
  Date around(Date tradeDate) : withinValueDateCalculator(tradeDate) {

    if (tradeDate == null) {
      throw new IllegalArgumentException("Trade date cannot be null");
    }

    Date valueDate = proceed(tradeDate);

    try {
      if (valueDate == null) {
        throw new InvalidValueDateException(
          "Value date cannot be null");
      }

      if (valueDate.compareTo(tradeDate) == 0 
        || valueDate.before(tradeDate)) {
        throw new InvalidValueDateException(
          "Value date must be after trade date");
      }

    } catch(Exception e) {
      // handle
    }

    return valueDate;
  }
}

Is this Approach Developer-friendly ?

Aspects, being looked upon as an artifact with the *obliviousness* property, are best kept completely decoupled from the interfaces. Yet, the complete configuration of a contract for a module can be known only with the full knowledge of all aspects that weave together with the interface. Hence we have the experts working on how to engineer aspect-aware-interfaces as contracts for the modules of a system, yet maintaining the obliviousness property.

While extending from an abstract class, the constraints are always localized within the super class for the implementer, in this case, the aspects, which encapsulate the constraints, may not be "textually local". Hence it may be difficult for the implementer to be aware of these constraints without strong tooling support towards this .. However, Eclipse 3.2 along with ajdt shows all constraints and advices associated with the aspect :

at aspect declaration site ..

and at aspect call site ..

And also, aspects do not provide that much fine grained control over object behavior than in-situ Java classes. You can put before(), after() and around() advices, but I still think abstract classes allow a more fine grained control over assertion, invariant and behavior parameterization.

Dependencies, Side-effects and Undocumented Behaviors

In response to my last post, Cedric had expressed the concern that with the approach of modeling domain abstractions using abstract classes with mandatory behavioral constraints

Before you know it, your clients will be relying on subtle side-effects and undocumented behaviors of your code, and it will make future evolution much harder than if you had used interfaces from the start.

I personally feel that for mandatory behaviors, assertions and invariants, I would like to have all concrete implementations *depend* on the abstract class - bear in mind that *only* those constraints go to the abstract class which are globally true and must be honored for all implementations. And regarding unwanted side-effects, of course, it is not desirable and often depends on the designer.

From this point of view, the above implementation using interfaces and aspects also suffer from the same consequences - implementers depend on concrete aspects and are always susceptible to unwanted side-effects.

Would love to hear what others feel about this ..