Monday, November 05, 2007

Towards more Powerful Abstractions in the Javaland

Way back in 2000, Scott Myers suggested improving encapsulation through increased usage of non-member, non-friend functions in C++. He had this punchline as the starting point of his column :
If you're writing a function that can be implemented as either a member or as a non-friend non-member, you should prefer to implement it as a non-member function. That decision increases class encapsulation. When you think encapsulation, you should think non-member functions.

While discussing the degrees of encapsulation, he explains that the amount of encapsulation in a class is inversely proportional to the number of functions that may break if the implementation of the class has to change. And that being the case, it becomes clear that a class with n member functions is more encapsulated than a class with n+1 member functions.

I was then an avid C++ programmer and this article hit me like a bolt. This was possibly the first instance of an exposition (that I found) on OO that makes you think away from the kingdom of nouns. Keep the number of verbs hanging off the nouns to a minimum, use the C++ namespace as the module of encapsulation, so as to keep the class interfaces complete and minimal.

Shortly afterwards, Herb Sutter, in one his Guru-of-the-Week columns, Monoliths "Unstrung", dissected std::basic_string to identify the set of functions which should have been implemented as member functions instead of the current count of 103.

Very recently, Reg Braithwaite and Buko Obele discussed the inappropriateness of applying the noun/verb metaphor to object oriented programming. It is not correct to model all verbs hanging off the nouns - Obele goes on to say ..
The question of who verbs 'belong to' is always the wrong question; instead it is always a matter of how a new concept can be meaningfully introduced into our existing system.

In fact, long back, the design of the C++ Standard library was based on the concept of making the algorithms (the verbs) the first class citizens implemented generically on the containers that they operate on.

It's not the objects carrying the functions that make the OO way of modeling things - the language has to support powerful abstractions that help programmers write extensible code.

The Expression Problem

This has been one of the classical examples to demonstrate how mainstream OO languages lack abstractions for modeling an extensible solution. The problem has been described simplistically in this paper by Zenger and Odersky :
Suppose we have a datatype which is defined by a set of cases and we have processors which operate on this datatype. There are primarily two directions along which we can extend such a system:
• The extension of the datatype with new data variants,
• The addition of new processors.

The classical OO approach to solve this problem involves designing a polymorphic hierarchy of datatypes, with each concrete subclass modeling a data variant and implementing the set of processors. With this approach, extending datatypes is easy, just add a variant as another subclass. But extending processors is invasive, violates the Open/Closed principle and forces you to dig into every existing data variant. A typical fallout of embracing the noun/verb paradigm of modeling.

The alternate not-a-strictly-OO approach adopted by programmers to solve these problems is by implementing some sort of double dispatch using the Visitor design pattern. This is an example where the solution gets overtly complex and forces you to write code that simulates the run time type dispatch mechanism of the underlying programming language. The Visitor design pattern is yet another example of a workaround for the noun/verb metaphor in mainstream OO languages. The Visitor abstractions allow a physical separation of the processors from the datatype, but the individual processors very much match one-to-one with each individual datatype. In this paper (Matching Objects with Patterns) discussing the Scala language capabilities, Burak Emir et. al. states this problem with Visitors more succinctly :
The visitor design pattern causes a relatively high notational overhead for framework construction, because a visitor class has to be defined and matchWith methods have to be provided in all data variants. The pattern matching itself is disciplined but very verbose, especially for deep patterns. Visitors in their standard setting do not maintain representation independence, because case methods correspond one-to-one to data alternatives.

How can we get around this accidental complexity and make our OO code more succinct ? The answer is simple - more powerful abstractions.

Scala Pattern Matching

Scala offers pattern matching as a solution to the expression problem. Pattern matching has been one of the key features in many functional languages like Erlang - Scala has brought this feature to the OO paradigm. Martin Odersky has been quite a strong advocate of pattern matching, a feature which many purists consider orthogonal to encapsulation of abstractions. But definitely pattern matching in Scala provides a solution away from the typical noun/verb syndrome in today's OO modeling paradigms. Have a look here for a complete example of pattern matching in Scala to solve a problem which would have been much more verbose and complex using visitors.

And Java ?

Java has so long been the kingpin of the noun kingdom and there are tons of Java code that model verbs hanging off the nouns. However, with more and more functional paradigms crying out on the sidelines for their entry into the noun kingdom, things look to be changing. I have been dabbling a bit with the prototype of Closures released by Neal Gafter, and things have already started looking interesting. Closures in Java is definitely a very very welcome feature, and the community has already started battling for their introduction in Java 7. Ricky Clarkson has implemented pattern matching in Java using Neal's prototype. Though at a very early stage, it is indeed very heartening to see more powerful forms of abstractions making their way into the Java programming language. Java, as a platform, has already been enriched with languages like Scala and JRuby contributing many powerful abstractions of functional programming.

Reg has demonstrated the Equivalence relationship using the visitor pattern in a manner conforming to the tradition of verbosity in Java. With pattern matching and closures, things will improve towards more succinctness. Here is a version based on the pattern matching engine of Ricky :

class Main
  public static void main(String[] args) {
    Collection coll1 = new ArrayList<Integer>() {{ add(12); add(15); add(10); }};
    Collection coll2 = new TreeSet<Integer>() {{ add(12); add(15); add(10); }};

    // equivalence of List-List and List-Set
    System.out.println(equivalent(coll1, coll2)
        .add(List.class, List.class, {List l1, List l2 => CollectionUtils.isEqualCollection(l1, l2)})
        .add(List.class, Set.class, {List l1, Set s1 => CollectionUtils.isEqualCollection(l1, s1)})

    Map<Integer, Integer> coll3 = new HashMap<Integer, Integer>() {{
      put(1, 12);
      put(2, 15);
      put(3, 10);

    // equivalence of List-List, List-Set, List-Map
    System.out.println(equivalent(coll2, coll3)
        .add(List.class, List.class, {List l1, List l2 => CollectionUtils.isEqualCollection(l1, l2)})
        .add(List.class, Set.class, {List l1, Set s1 => CollectionUtils.isEqualCollection(l1, s1)})
        .add(List.class, Map.class, {List l1, Map m1 => CollectionUtils.isEqualCollection(l1, m1.values())})

  public static <T,R> Matcher<T,R> equivalent(T t1, T t2) {
    return new Matcher<T,R>(t1, t2);

  public static <T,R> Matcher<T,R> match(T t1, T t2) {
    return new Matcher<T,R>(t1, t2);

  public static class Matcher<T,R> {
    public final T t1;
    public final T t2;
    public R r;

    public Matcher(T t1, T t2) {
      this.t1 = t1;
      this.t2 = t2;

    public <U1 extends T, U2 extends T> Matcher<T,R> add(Class<U1> aCase, Class<U2> bCase, {U1,U2=>R} f) {
      if (aCase.isInstance(t1) && bCase.isInstance(t2))
        r=f.invoke(aCase.cast(t1), bCase.cast(t2));

      return this;

    public R done() {
      return r;

Definitely a more succinct way to model equivalence than what we are used to in the Javaland. And this is only a start. There are quite a few rough edges to smoothen out, particularly with respect to handling generics and type erasure issues. But, I am sure we will see more and more power of abstractions coming into Java with people like Neal Gafter and Doug Lea pitching in. A tonne of thanks to these thoughtleaders for still keeping the Java language an interesting platform to play around with.


Anonymous said...

I do love how Java and C# are developing in lock-step.

Anonymous said...

Why use 'car' and 'cdr' when their meaning is so extremely non-obvious? 'Head' and 'tail', anyone?

Anonymous said...