Ruminations of a Programmer: August 2010

Friday, August 27, 2010

Random thoughts on Clojure Protocols

Great languages offer orthogonality in design. Stated simply, the language core offers a minimal set of non-overlapping ways to compose abstractions. In an earlier article, A Case for Orthogonality in Design, I discussed some features of languages like Haskell, C++ and Scala that help you compose higher order abstractions from smaller ones.

In this post I discuss protocols, a new feature in Clojure that made its way into the recently released 1.2. I am not going into what protocols are - there are quite a few nice articles that introduce Clojure protocols and the associated defrecord and deftype forms. This post offers some random thoughts on how protocols encourage non-intrusive extension of abstractions without muddling inheritance into polymorphism. I also discuss some of my realizations about what protocols aren't, which I felt were as important as understanding what they are.

Let's start with the familiar Show type class of Haskell ..

> :t show
show :: (Show a) => a -> String

show takes a value and renders a string for it. You get show for your type if you have made it an instance of the Show type class. The Show type class extends your abstraction transparently with an additional set of behaviors. We can do the same thing using protocols in Clojure ..

(defprotocol SHOW
  (show [val]))

The protocol definition just declares the contract without any concrete implementation in it. Under the covers it generates a Java interface which you can use in your Java code as well. But a protocol is not an interface.

Adding behaviors non-invasively ..

I can extend an existing type with the behaviors of this protocol. And for this I need not have the source code for the type. This is one of the benefits that ad hoc polymorphism of type classes offers - type classes (and Clojure protocols) are open. Note how this is in contrast to the compile time coupling of Java interface and inheritance.

Extending java.lang.Integer with SHOW ..

(extend-type Integer
  SHOW
  (show [i] (.toString i)))

We can extend an interface also. And get access to the added behavior from *any* of its implementations .. Here's extending clojure.lang.IPersistentVector ..

(extend-type clojure.lang.IPersistentVector
  SHOW
  (show [v] (.toString v)))

(show [12 1 4 15 2 4 67])
> "[12 1 4 15 2 4 67]"

And of course I can extend my own abstractions with the new behavior ..

(defrecord Name [last first])

(defn name-desc [name]
  (str (:last name) " " (:first name)))

(name-desc (Name. "ghosh" "debasish")) ;; "ghosh debasish"

(extend-type Name
  SHOW
  (show [n]
    (name-desc n)))

(show (Name. "ghosh" "debasish")) ;; "ghosh debasish"
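For comparison, here is a rough sketch of the same non-invasive extension written as a Scala type class. It is only an illustration and not part of the Clojure code above: the names ShowSketch, Show, intShow and nameShow are made up for this example, and Name simply mirrors the record defined earlier.

```scala
object ShowSketch {
  // Hypothetical Scala counterpart of the SHOW protocol: a type class.
  trait Show[A] {
    def show(a: A): String
  }

  // Instance for an existing type that we do not own.
  implicit val intShow: Show[Int] = new Show[Int] {
    def show(a: Int): String = a.toString
  }

  // Our own abstraction, mirroring the Name record.
  case class Name(last: String, first: String)

  implicit val nameShow: Show[Name] = new Show[Name] {
    def show(n: Name): String = n.last + " " + n.first
  }

  // The instance is selected from the static type of the argument.
  def show[A](a: A)(implicit s: Show[A]): String = s.show(a)

  def main(args: Array[String]): Unit = {
    println(show(12))                        // 12
    println(show(Name("ghosh", "debasish"))) // ghosh debasish
  }
}
```

As with the protocol, neither Int nor Name is modified; the behavior lives in separate instances that can be added later.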

No Inheritance

Protocols help you wire together abstractions that are in no way related to each other, and they do this non-invasively. A type conforms to a protocol only if it implements the contract. As I mentioned before, there's no notion of hierarchy or inheritance related to this form of polymorphism.

No object bloat, no monkey patching

And there's no object bloat going on here. You can invoke show on any abstraction for which you implement the protocol, but show is never added as a method on that object. As an example try the following after implementing SHOW for Integer ..

(filter #(= "show" (.getName %)) (.getMethods Integer))

will return an empty sequence. Hence there is no scope for *accidentally* overriding someone else's monkey patch on some shared class.

Not really a type class

Clojure protocols dispatch on the type of the first argument of their methods. This keeps them from offering the full power of Haskell / Scala type classes. Consider the counterpart of Show in Haskell, the Read type class ..

> :t read
read :: (Read a) => String -> a

If your type is an instance of Read, then the instance of read that gets invoked depends on the return type, e.g.

> [1,2,3] ++ read "[4,5,6]"
=> [1,2,3,4,5,6]

The instance of read that returns a list of integers is selected automatically here. Haskell infers the result type from the context and resolves the matching Read instance (its dictionary of methods) at compile time.

We cannot do this with Clojure protocols, since they cannot dispatch on the return type. Protocols dispatch only on the first argument of the function.
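Since Scala type classes were mentioned alongside Haskell's, here is a minimal sketch of what dispatch on the result type can look like there. Everything in it is an assumption made for illustration: Read, intRead, listRead and the deliberately naive list parser are hypothetical names, not a library API.

```scala
object ReadSketch {
  // Hypothetical Scala analogue of Haskell's Read: parse a String into an A.
  trait Read[A] {
    def read(s: String): A
  }

  implicit val intRead: Read[Int] = new Read[Int] {
    def read(s: String): Int = s.trim.toInt
  }

  // Parses text such as "[4,5,6]"; a naive parser, good enough for the sketch.
  implicit def listRead[A](implicit ra: Read[A]): Read[List[A]] = new Read[List[A]] {
    def read(s: String): List[A] = {
      val inner = s.trim.stripPrefix("[").stripSuffix("]").trim
      if (inner.isEmpty) Nil else inner.split(",").toList.map(ra.read)
    }
  }

  // The instance is chosen from the requested result type, not from any argument.
  def read[A](s: String)(implicit r: Read[A]): A = r.read(s)

  def main(args: Array[String]): Unit = {
    val xs: List[Int] = List(1, 2, 3) ++ read[List[Int]]("[4,5,6]")
    println(xs) // List(1, 2, 3, 4, 5, 6)
    val n: Int = read("42") // A is inferred as Int from the expected type
    println(n)  // 42
  }
}
```

Both calls pass a String as the only explicit argument, so there is nothing in the arguments to dispatch on; the choice is made purely from the type expected at the call site.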

Tuesday, August 10, 2010

Using generalized type constraints - How to remove code with Scala 2.8

I love removing code. The more I remove, the smaller the surface area for bugs to bite. Just now I removed a bunch of classes, made unnecessary by the Scala 2.8.0 type system.

Consider this set of abstractions, elided for demonstration purposes ..

trait Instrument

// equity
case class Equity(name: String) extends Instrument

// fixed income
abstract class FI(name: String) extends Instrument
case class DiscountBond(name: String, discount: Int) extends FI(name)
case class CouponBond(name: String, coupon: Int) extends FI(name)

Well, it's the (simplified) hierarchy of instruments that get traded on a securities exchange every day. Now we model a security trade that exchanges instruments and currencies ..

class Trade[I <: Instrument](id: Int, account: String, instrument: I) {
  //..
  def calculateNetValue(..) = //..
  def calculateValueDate(..) = //..
  //..
}

In real life a trade will have lots and lots of attributes. But here we don't need them, since our only purpose here is to demonstrate how we can throw away some piece of code :)

Trade can have lots of methods which model the domain logic of the trading process, calculating the net amount of the trade, the value date of the trade etc. Note all of these are valid processes for every type of instrument.

Consider one use case that calculates the accrued interest of a trade. The difference from the other methods is that accrued interest is only applicable to coupon bonds, which, according to the above hierarchy, are a subtype of FI. How do we express this constraint in the above Trade abstraction? What we need is to constrain the instrument type within the method.

My initial implementation was to make the AccruedInterestCalculator a separate class parameterized with the Trade of the appropriate type of instrument ..

class AccruedInterestCalculator[T <: Trade[CouponBond]](trade: T) {
  def accruedInterest(convention: String) = //.. impl
}

and use it as follows ..

val cb = CouponBond("IBM", 10)
val trd = new Trade(1, "account-1", cb)
new AccruedInterestCalculator(trd).accruedInterest("30U/360")

Enter Scala 2.8 and the generalized type constraints ..

Before Scala 2.8, we could not specialize the Instrument type I for any specific method within Trade beyond what was specified as the constraint in defining the Trade class. Since calculation of accrued interest is only valid for coupon bonds, we could only achieve the desired effect by having a separate abstraction as above. Or we could take recourse to runtime checks.

Scala 2.8 introduces generalized type constraints which allow you to do exactly this. There are 3 variants:

A =:= B, which mandates that A and B must match exactly
A <:< B, which mandates that A must conform to B
A <%< B, which mandates that A must be viewable as B

Predef.scala contains these definitions. Note that unlike <: or >:, the generalized type constraints are not part of the language syntax. They are ordinary classes, and the compiler enforces a constraint by looking for an implicit instance of the corresponding class at the call site. Here's an example for our use case ..

class Trade[I <: Instrument](id: Int, account: String, instrument: I) {
  //..
  def accruedInterest(convention: String)(implicit ev: I =:= CouponBond): Int = {
    //..
  }
}

ev is the implicit evidence supplied by the compiler, which ensures that we can invoke accruedInterest only for CouponBond trades. You can now do ..

val cb = CouponBond("IBM", 10)
val trd = new Trade(1, "account-1", cb)
trd.accruedInterest("30U/360")

while the compiler will complain with an equity trade ..

val eq = Equity("GOOG")
val trd = new Trade(2, "account-1", eq)
trd.accruedInterest("30U/360")  // does not compile: the instrument is an Equity, not a CouponBond
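The snippets above elide the method bodies, so here is a self-contained sketch that can be compiled as is. The hierarchy is trimmed to two instruments, and the body of accruedInterest is a placeholder of my own (it simply returns the coupon), not a real day-count calculation.

```scala
object TradeSketch {
  trait Instrument
  case class Equity(name: String) extends Instrument
  case class CouponBond(name: String, coupon: Int) extends Instrument

  class Trade[I <: Instrument](val id: Int, val account: String, val instrument: I) {
    // ev proves I is CouponBond and also converts the instrument to that type.
    // Placeholder calculation (an assumption): just report the coupon.
    def accruedInterest(convention: String)(implicit ev: I =:= CouponBond): Int =
      ev(instrument).coupon
  }

  def main(args: Array[String]): Unit = {
    val bondTrade = new Trade(1, "account-1", CouponBond("IBM", 10))
    println(bondTrade.accruedInterest("30U/360")) // 10

    val equityTrade = new Trade(2, "account-1", Equity("GOOG"))
    // equityTrade.accruedInterest("30U/360")     // rejected at compile time
    println(equityTrade.instrument.name)          // GOOG
  }
}
```

Note that the evidence doubles as a function from I to CouponBond, which is what lets the method body reach the coupon field without a cast.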

Now I can throw away my AccruedInterestCalculator class and all the associated machinery. A simple type constraint tells us a lot and models domain constraints, all at compile time. Yum!
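For some intuition on how a class can act as a compile-time constraint, here is a simplified stand-in for =:=. SameType, refl and lengthIfString are hypothetical names invented for this sketch; the real definitions in the standard library are more involved.

```scala
object EvidenceSketch {
  // Hypothetical, simplified stand-in for Predef's =:=.
  sealed abstract class SameType[A, B] {
    def apply(a: A): B
  }

  object SameType {
    // The only instance available relates a type to itself.
    implicit def refl[A]: SameType[A, A] = new SameType[A, A] {
      def apply(a: A): A = a
    }
  }

  // Compiles only at call sites where SameType[A, String] can be found.
  def lengthIfString[A](a: A)(implicit ev: SameType[A, String]): Int =
    ev(a).length

  def main(args: Array[String]): Unit = {
    println(lengthIfString("abc")) // 3
    // lengthIfString(42)          // would not compile: no SameType[Int, String]
  }
}
```

Because the class is sealed and its companion offers only refl, the implicit search succeeds exactly when the two type arguments are the same type.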

You can also use the other variants to great effect when modeling your domain logic. Suppose you have a method that can be invoked only for FI instruments. You can express the constraint succinctly using <:< ..

class Trade[I <: Instrument](id: Int, account: String, instrument: I) {
  //..
  def validateInstrumentNotMatured(implicit ev: I <:< FI): Boolean = {
    //..
  }
}
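A compilable sketch of the <:< variant follows. To have something to validate, it assumes that every FI carries a maturityYear and that the method takes a hypothetical asOfYear parameter; neither appears in the code above, and the real logic remains elided.

```scala
object ConformanceSketch {
  trait Instrument
  case class Equity(name: String) extends Instrument

  // Assumption for this sketch only: fixed income instruments have a maturity year.
  abstract class FI extends Instrument { def maturityYear: Int }
  case class DiscountBond(name: String, discount: Int, maturityYear: Int) extends FI
  case class CouponBond(name: String, coupon: Int, maturityYear: Int) extends FI

  class Trade[I <: Instrument](val id: Int, val account: String, val instrument: I) {
    // Callable only when I conforms to FI; ev views the instrument as an FI.
    // asOfYear is hypothetical and stands in for the elided logic.
    def validateInstrumentNotMatured(asOfYear: Int)(implicit ev: I <:< FI): Boolean =
      ev(instrument).maturityYear > asOfYear
  }

  def main(args: Array[String]): Unit = {
    val discountTrade = new Trade(1, "account-1", DiscountBond("T-BILL", 5, 2030))
    val couponTrade = new Trade(2, "account-1", CouponBond("IBM", 10, 2005))
    println(discountTrade.validateInstrumentNotMatured(2010)) // true
    println(couponTrade.validateInstrumentNotMatured(2010))   // false

    val equityTrade = new Trade(3, "account-1", Equity("GOOG"))
    // equityTrade.validateInstrumentNotMatured(2010) // rejected: Equity is not an FI
    println(equityTrade.instrument.name)              // GOOG
  }
}
```

Unlike =:=, the constraint here accepts both DiscountBond and CouponBond trades, since each conforms to FI.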

This post is not about discussing all capabilities of generalized type constraints in Scala. Have a look at these two threads on StackOverflow and this informative gist by Jason Zaugg (@retronym on Twitter) for all the details. I just showed you how I removed some of my code to model my real world domain logic in a more succinct way that also fails fast during compile time.

Update: In response to the comments regarding Strategy implementation ..

Strategy makes a great use case when you want to have multiple implementations of an algorithm. In my case there was no variation. Initially I kept it as a separate abstraction because I was not able to constrain the instrument type in the accruedInterest method while being within the Trade class. Calculating accruedInterest is a normal domain operation for a CouponBond trade - hence trade.accruedInterest(..) looks to be a natural API for the context.

Now let us consider the case when the calculation strategy can vary. We can very well extract the variable part from the core implementation and model it as a separate strategy abstraction. In our case, say the calculation of accrued interest will depend on principal of the trade and the trade date (again, elided for simplicity of demonstration) .. hence we can have the following contract and one sample implementation:

trait CalculationStrategy {
  def calculate(principal: Int, tradeDate: java.util.Date): Int
}

case class DefaultImplementation(name: String) extends CalculationStrategy {
  def calculate(principal: Int, tradeDate: java.util.Date) = {
    //.. impl
  }
}

But how do we use it within the core API that the Trade class publishes? Type Classes to the rescue (once again!) ..

class Trade[I <: Instrument](id: Int, account: String, instrument: I) {
  //..
  def accruedInterest(convention: String)(implicit ev: I =:= CouponBond, strategy: CalculationStrategy): Int = {
    //..
  }
}

and we can now use the type classes using our own specific implementation ..

implicit val strategy = DefaultImplementation("default")

val cb = CouponBond("IBM", 10)
val trd = new Trade(1, "account-1", cb)
trd.accruedInterest("30U/360")  // uses the default type class for the strategy

Now we have the best of both worlds. We implement the domain constraint on instrument using the generalized type constraints and use type classes to make the calculation strategy flexible.
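Here is the whole arrangement as one compilable sketch. The principal and tradeDate fields on Trade, the FlatOnePercent strategy and its arithmetic are all assumptions made for the example; the actual calculation is elided above and stays that way.

```scala
import java.util.Date

object StrategySketch {
  trait Instrument
  case class CouponBond(name: String, coupon: Int) extends Instrument

  trait CalculationStrategy {
    def calculate(principal: Int, tradeDate: Date): Int
  }

  // Placeholder arithmetic (an assumption): a flat 1% of the principal.
  case class FlatOnePercent(name: String) extends CalculationStrategy {
    def calculate(principal: Int, tradeDate: Date): Int = principal / 100
  }

  // principal and tradeDate are hypothetical fields added for this sketch.
  class Trade[I <: Instrument](val id: Int, val account: String, val instrument: I,
                               val principal: Int, val tradeDate: Date) {
    // The type constraint and the strategy travel in the same implicit list.
    def accruedInterest(convention: String)(implicit ev: I =:= CouponBond,
                                            strategy: CalculationStrategy): Int =
      strategy.calculate(principal, tradeDate)
  }

  def main(args: Array[String]): Unit = {
    implicit val strategy: CalculationStrategy = FlatOnePercent("flat")
    val trd = new Trade(1, "account-1", CouponBond("IBM", 10), 5000, new Date())
    println(trd.accruedInterest("30U/360")) // 50
  }
}
```

Swapping the calculation then only means bringing a different implicit CalculationStrategy into scope; the call on the trade does not change.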

Monday, August 09, 2010

Updates on DSLs In Action - Into Copy Editing

I have completed writing DSLs In Action. As we speak, the book has moved from the development editor to the copy editor. I will be starting the process of copy editing along with the team of helpful copy editors of Manning.

The Table of Contents has been finalized. Have a look at the details and send me your feedback regarding the contents of the book.

DSLs In Action is a book for the practitioner. It contains real world experience of writing DSLs in a multitude of JVM languages. As the table of contents shows, I have used Java, Groovy, Ruby, Scala and Clojure to demonstrate their power in DSL design and implementation. I have also focused on the integration aspects between these languages, which is fashionably known today by the name of polyglot programming.

All examples in the book are from the real world domain of securities trading and brokerage systems. I have intentionally chosen a specific domain to demonstrate the progression of DSL implementation from small trivial examples to serious complex and non-trivial ones. This also goes to bust a common myth that DSLs are applicable only for toy examples.

Another recurring theme throughout the book has been a strong focus on abstraction design. Designing good DSLs is an exercise in making well-designed abstractions. A DSL is really a thin linguistic abstraction on top of the semantic model of the domain. If the underlying model is expressive enough and publishes well behaved abstractions, then designing a user friendly syntax on top of it becomes easy. The book discusses lots of tools and techniques that will help you think in terms of designing expressive DSLs.

The book is replete with code written in multiple languages. You can get it all by cloning my github repo which contains maven based instructions to try most of them yourself.

And finally, thanks to all the reviewers for the great feedback received so far. They have contributed a lot towards improvement of the book; all remaining mistakes are mine.