Tuesday, September 27, 2011

Non blocking composition using Redis and Futures

scala-redis now supports pooling of Redis clients. Using RedisClientPool you can do some cool stuff in non blocking mode and get an improved throughput for your application.

Suppose you have a bunch of operations that you can theoretically execute in parallel, maybe a few disjoint list operations and a few operations on key/values .. like the following snippets ..

import com.redis.RedisClientPool

val clients = new RedisClientPool("localhost", 6379)

// left push to a list
def lp(msgs: List[String]) = {
  clients.withClient {client => {
    msgs.foreach(client.lpush("list-l", _))
    client.llen("list-l")
  }}
}

// right push to another list
def rp(msgs: List[String]) = {
  clients.withClient {client => {
    msgs.foreach(client.rpush("list-r", _))
    client.llen("list-r")
  }}
}

// key/set operations
def set(msgs: List[String]) = {
  clients.withClient {client => {
    var i = 0
    msgs.foreach { v =>
      client.set("key-%d".format(i), v)
      i += 1
    }
    Some(1000) // some dummy
  }}
}

Since Redis is single threaded, you can use client pooling to allocate multiple clients and fork these operations concurrently .. Here's a snippet that does these operations asynchronously using Scala futures ..

// generate some arbitrary values
val l = (0 until 5000).map(_.toString).toList

// prepare the list of functions to invoke
val fns = List[List[String] => Option[Int]](lp, rp, set)

// schedule the futures
val tasks = fns map (fn => scala.actors.Futures.future { fn(l) })

// wait for results
val results = tasks map (future => future.apply())
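scala.actors has been dropped from newer Scala versions, so here's a minimal sketch of the same fork/join shape using scala.concurrent.Future from the standard library. The three functions are stand-ins that do no Redis I/O (an assumption made to keep the sketch self-contained), and the 10 second timeout is arbitrary ..

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ForkJoinSketch {
  // hypothetical stand-ins for lp, rp and set: no Redis involved
  def lp(msgs: List[String]): Option[Int] = Some(msgs.size)
  def rp(msgs: List[String]): Option[Int] = Some(msgs.size)
  def set(msgs: List[String]): Option[Int] = Some(1000)

  def run(): List[Option[Int]] = {
    // generate some arbitrary values
    val l = (0 until 5000).map(_.toString).toList

    // prepare the list of functions to invoke
    val fns = List[List[String] => Option[Int]](lp, rp, set)

    // schedule the futures on the global execution context
    val tasks: List[Future[Option[Int]]] = fns.map(fn => Future(fn(l)))

    // block only once, at the very end, for all results
    Await.result(Future.sequence(tasks), 10.seconds)
  }
}
```

Everything lives inside run() rather than in the object body, since blocking on futures from an object initializer can deadlock.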

And while we are on this topic of using futures for non blocking redis operations, Twitter has a cool library finagle that offers lots of cool composition stuff on Futures and other non blocking RPC mechanisms. Over the weekend I used some of them to implement scatter/gather algorithms over Redis. I am not going into the details of what I did, but here's a sample dummy example of the stuff you can do with RedisClientPool and the Future implementation of Finagle ..

The essential idea is to be able to compose futures and write non blocking code all the way down. This is made possible through monadic non-blocking map and flatMap operations and a host of other utility functions that use them. Here's an example ..

def collect[A](fs: Seq[Future[A]]): Future[Seq[A]] = { //..

It uses flatMap and map to collect the results from the given list of futures into a new future of Seq[A].
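To make the idea concrete, here's a sketch of how such a collect could be written with nothing but flatMap and map. It uses scala.concurrent.Future from the standard library instead of Finagle's Future, so it is an illustration of the technique and not Finagle's actual implementation ..

```scala
import scala.concurrent.{ExecutionContext, Future}

object CollectSketch {
  // fold over the futures, threading an accumulator future through flatMap;
  // results come out in the same order as the input futures
  def collect[A](fs: Seq[Future[A]])(implicit ec: ExecutionContext): Future[Seq[A]] =
    fs.foldLeft(Future.successful(Vector.empty[A]): Future[Vector[A]]) { (acc, f) =>
      acc.flatMap(xs => f.map(x => xs :+ x))
    }
}
```

The input futures are already running when collect is called, so the fold only sequences the gathering of results, not the work itself. If any input future fails, the resulting future fails too.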

Let's have a look at a specific example where we push a number of elements into 100 lists concurrently using a pool of futures, backed by ExecutorService. This is the scatter phase of the algorithm. The function listPush actually does the push using a RedisClientPool and each of these operations is done within a Future. FuturePool gives you a Future where you can specify timeouts and exception handlers using Scala closures.

Note how we use the combinator collect for concurrent composition of the futures. The resulting future that collect returns will be complete when all the underlying futures have completed.

After the scatter phase we prepare for the gather phase by pipelining the future computation using flatMap. Unlike collect, flatMap is a combinator for sequential composition. In the following snippet, once allPushes completes, the result pipelines into the following closure that generates another Future. The whole operation completes only when both the futures have completed, or when either of them fails with an exception.

For more details on how to use these combinators on Future abstractions, have a look at the tutorial that the Twitter guys published recently.

implicit val timer = new JavaTimer

// set up Executors
val futures = FuturePool(Executors.newFixedThreadPool(8))

// abstracting the flow with future
private[this] def flow[A](noOfRecipients: Int, opsPerClient: Int, fn: (Int, String) => A) = {
  val fs = (1 to noOfRecipients) map {i => 
    futures {
      fn(opsPerClient, "list_" + i)
    }.within(40.seconds) handle {
      case _: TimeoutException => null.asInstanceOf[A]
    }
  }
  Future.collect(fs)
}

// scatter across clients and gather them to do a sum
def scatterGatherWithList(opsPerClient: Int)(implicit clients: RedisClientPool) = {
  // scatter
  val allPushes: Future[Seq[String]] = flow(100, opsPerClient, listPush)
  val allSum = allPushes flatMap {result =>
    // gather
    val allPops: Future[Seq[Long]] = flow(100, opsPerClient, listPop)
    allPops map {members => members.sum}
  }
  allSum.apply
}
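If you want to play with the shape of this flow without Finagle or a running Redis, here's a sketch under some loud assumptions: scala.concurrent.Future replaces Finagle's Future, Future.sequence plays the role of collect, and listPush / listPop are hypothetical stand-ins backed by an in-memory map instead of Redis lists ..

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ScatterGatherSketch {
  // in-memory stand-in for the Redis lists (assumption, not scala-redis)
  private val store = new ConcurrentHashMap[String, List[Long]]()

  // hypothetical listPush: stores 1..count under the key, returns the key
  def listPush(count: Int, key: String): String = {
    store.put(key, (1 to count).map(_.toLong).toList)
    key
  }

  // hypothetical listPop: removes the list and sums up to count elements
  def listPop(count: Int, key: String): Long =
    Option(store.remove(key)).getOrElse(Nil).take(count).sum

  // one future per recipient, gathered into a single future
  def flow[A](n: Int, ops: Int, fn: (Int, String) => A)(implicit ec: ExecutionContext): Future[Seq[A]] =
    Future.sequence((1 to n).map(i => Future(fn(ops, "list_" + i))))

  def scatterGather(n: Int, ops: Int): Long = {
    val pool = Executors.newFixedThreadPool(8)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // scatter, then (sequentially, via flatMap) gather and sum
      val allSum = flow[String](n, ops, listPush).flatMap { _ =>
        flow[Long](n, ops, listPop).map(_.sum)
      }
      Await.result(allSum, 40.seconds)
    } finally pool.shutdown()
  }
}
```

With 4 recipients and 3 operations each, every list holds 1, 2, 3, so scatterGather(4, 3) adds up to 24.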

For the complete example implementations of these patterns like scatter/gather using Redis, have a look at the github repo for scala-redis.

Monday, July 25, 2011

Monad Transformers in Scala

Monads don't compose .. and hence Monad Transformers. A monad transformer maps monads to monads. It lets you transform a monad with additional computational effects. Stated simply, if you have a monadic computation in place you can enrich it incrementally with additional effects like states and errors without disturbing the whole structure of your program.

A monad transformer is represented by the kind T :: (* -> *) -> * -> *. The general contract that a monad transformer offers is ..

class MonadTrans t where
 lift :: (Monad m) => m a -> t m a

Here we lift a computation m a into the context of another effect t. We call t the monad transformer; applied to a monad m it yields t m, which is itself a monad.

Well in this post, I will discuss monad transformers in scala using scalaz 7. And here's what you get as the base abstraction corresponding to the Haskell typeclass shown above ..

trait MonadTrans[F[_[_], _]] {
  def lift[G[_] : Monad, A](a: G[A]): F[G, A]
}

It takes a G and lifts it into another computation F thereby combining the effects into the composed monad. Let's look at a couple of examples in scalaz that use lift to compose a new effect into an existing one ..

// lift an Option into a List
// uses transLift available from the main pimp of scalaz

scala> List(10, 12).transLift[OptionT].runT
res12: List[Option[Int]] = List(Some(10), Some(12))

// uses explicit constructor methods

scala> optionT(List(some(12), some(50))).runT
res13: List[Option[Int]] = List(Some(12), Some(50))
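To see what lift has to do under the hood, here's a hand-rolled sketch that does not depend on scalaz at all. The Monad trait, the OptionT case class and lift below are simplified stand-ins written for this illustration (their names mirror scalaz, but the definitions are assumptions, not the library's code) ..

```scala
object OptionTSketch {
  // a bare-bones monad typeclass, just enough for the sketch
  trait Monad[M[_]] {
    def point[A](a: A): M[A]
    def bind[A, B](ma: M[A])(f: A => M[B]): M[B]
  }

  object Monad {
    implicit val listMonad: Monad[List] = new Monad[List] {
      def point[A](a: A): List[A] = List(a)
      def bind[A, B](ma: List[A])(f: A => List[B]): List[B] = ma.flatMap(f)
    }
  }

  // Option layered inside an arbitrary monad M
  final case class OptionT[M[_], A](run: M[Option[A]]) {
    def flatMap[B](f: A => OptionT[M, B])(implicit ev: Monad[M]): OptionT[M, B] =
      OptionT[M, B](ev.bind[Option[A], Option[B]](run) {
        case Some(a) => f(a).run                  // keep going
        case None    => ev.point[Option[B]](None) // short circuit inside M
      })

    def map[B](f: A => B)(implicit ev: Monad[M]): OptionT[M, B] =
      flatMap[B](a => OptionT[M, B](ev.point[Option[B]](Some(f(a)))))
  }

  // lift: wrap every value of the underlying monad in Some
  def lift[M[_], A](ma: M[A])(implicit ev: Monad[M]): OptionT[M, A] =
    OptionT[M, A](ev.bind[A, Option[A]](ma)(a => ev.point[Option[A]](Some(a))))
}
```

Lifting List(10, 12) gives an OptionT whose run is List(Some(10), Some(12)), the same shape as the scalaz REPL output above.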

If you are like me, you must have already started wondering about the practical usage of this monad transformer thingy. Really we want to see them in action in some meaningful code which we write in our daily lives.

In the paper titled Monad Transformers Step by Step, Martin Grabmüller does this in Haskell, evolving a complete interpreter for a small language using these abstractions and highlighting how they contribute towards an effective functional model of your code. In this post I render some of the scala manifestations of those examples using scalaz 7 as the library of implementation. The important point that Martin also mentions in his paper is that you need to think functionally and organize your code upfront using monadic structures in order to take full advantage of incremental enrichment through monad transformers.

We will be writing an interpreter for a very small language. I will first define the base abstractions and start with functional code as the implementation base. It does not contain much of the useful stuff like state management, error handling etc., which I will add incrementally using monad transformers. We will see how the core model remains the same, transformers get added in layers and the static type of the interpreter function states explicitly what effects have been added to it.

The Language

Here's the language for which we will be writing the interpreter. Pretty basic stuff with literal integers, variables, addition, λ expressions (abstraction) and function application. By abstraction and application I mean lambda terms .. so a quick definition for the uninitiated ..


- a lambda term may be a variable, x
- if t is a lambda term, and x is a variable, then λx.t is a lambda term (called a lambda abstraction)
- if t and s are lambda terms, then ts is a lambda term (called an application)


and the Scala definitions for the language elements ..

// variable names
type Name = String
  
// Expression types
trait Exp
case class Lit(i: Int) extends Exp
case class Var(n: Name) extends Exp
case class Plus(e1: Exp, e2: Exp) extends Exp
case class Abs(n: Name, e: Exp) extends Exp
case class App(e1: Exp, e2: Exp) extends Exp
  
// Value types
trait Value
case class IntVal(i: Int) extends Value
case class FunVal(e: Env, n: Name, exp: Exp) extends Value

// Environment in which the λ-abstraction will be evaluated
type Env = collection.immutable.Map[Name, Value]

// defining additional data constructors because in Scala
// typeclass resolution and variance often give some surprises with type inferencing

object Values {
  def intval(i: Int): Value = IntVal(i)
  def funval(e: Env, n: Name, exp: Exp): Value = FunVal(e, n, exp)
}

The Reference Implementation

I start with the base implementation, which is a functional model of the interpreter. It contains only the basic stuff for evaluation and has no monadic structure. Incrementally we will start having fun with this ..

def eval0: Env => Exp => Value = { env => exp =>
  exp match {
    case Lit(i) => IntVal(i)
    case Var(n) => (env get n).get
    case Plus(e1, e2) => {
      val IntVal(i1) = eval0(env)(e1)
      val IntVal(i2) = eval0(env)(e2)
      IntVal(i1 + i2)
    }
    case Abs(n, e) => FunVal(env, n, e)
    case App(e1, e2) => {
      val val1 = eval0(env)(e1)
      val val2 = eval0(env)(e2)
      val1 match {
        case FunVal(e, n, exp) => eval0((e + ((n, val2))))(exp)
      }
    }
  }
}

Note we assume that we have the proper matches everywhere - the Map lookup in processing variables (Var) doesn't fail and we have the proper function value when we go for the function application. So things look happy for the correct paths of expression evaluation ..

// Evaluate: 12 + ((λx -> x)(4 + 2))

scala> val e1 = Plus(Lit(12), App(Abs("x", Var("x")), Plus(Lit(4), Lit(2))))
e1: Plus = Plus(Lit(12),App(Abs(x,Var(x)),Plus(Lit(4),Lit(2))))

scala> eval0(collection.immutable.Map.empty[Name, Value])(e1)
res4: Value = IntVal(18)

Go Monadic

Monad transformers give you layers of control over the various aspects of your computation. But for that to happen you need to organize your code in a monadic way. Think of it like this - if your code models the computations of your domain (aka the domain logic) as per the contracts of an abstraction you can very well compose more of similar abstractions in layers without directly poking into the underlying implementation.

Let's do one thing - let's transform the above function into a monadic one that doesn't add any effect. It only sets up the base case for other monad transformers to prepare their playing fields. It's the Identity monad, which simply applies the bound function to its input without any additional computational effect. In scalaz 7 Identity simply wraps a value and provides a map and flatMap for bind.

Here's our next iteration of eval, this time with the Identity monad baked in .. eval0 was returning Value, eval1 returns Identity[Value] - the return type makes this fact explicit that we are now in the land of monads and have wrapped ourselves into a computational structure which can only be manipulated through the bounds of the contract that the monad allows.

type Eval1[A] = Identity[A]

def eval1: Env => Exp => Eval1[Value] = {env => exp =>
  exp match {
    case Lit(i) => intval(i).point[Eval1]
    case Var(n) => (env get n).get.point[Eval1]
    case Plus(e1, e2) => for {
      i <- eval1(env)(e1)
      j <- eval1(env)(e2)
    } yield {
      val IntVal(i1) = i
      val IntVal(i2) = j
      IntVal(i1 + i2)
    }
    case Abs(n, e) => funval(env, n, e).point[Eval1]
    case App(e1, e2) => for {
      val1 <- eval1(env)(e1)
      val2 <- eval1(env)(e2)
    } yield {
      val1 match {
        case FunVal(e, n, exp) => eval1((e + ((n, val2))))(exp)
      }
    }
  }
}

All returns are now monadic, though the basic computation remains the same. The Lit, Abs and the Var cases use the point function (pure in scalaz 6) equivalent to a Haskell return. Plus and App use the for comprehension to evaluate the monadic action. Here's the result on the REPL ..

scala> eval1(collection.immutable.Map.empty[Name, Value])(e1)
res7: Eval1[Value] = scalaz.Identity$$anon$2@18f67fc

scala> res7.value
res8: Value = IntVal(18)

So the Identity monad has successfully installed itself making our computational model like an onion peel on which we can now stack up additional effects.

Handling Errors

In eval1 we have a monadic functional model of our computation. But we have not yet handled any errors that may arise from the computation. And I promised that we will add such effects incrementally without changing the guts of your model.

As a very first step, let's use a monad transformer that helps us handle errors, not by throwing exceptions (exceptions are bad .. right?) but by wrapping the error conditions in yet another abstraction. Needless to say this also has to be monadic because we would like it to compose with our already implemented Identity monad and the others that we will work out later on.

scalaz 7 offers EitherT which we can use as the Error monad transformer. It is defined as ..

sealed trait EitherT[A, F[_], B] {
  val runT: F[Either[A, B]]
  //..
}

It adds the EitherT computation on top of F so that the composed monad will have both the effects. And as with Either we use the Left A for the error condition and the Right B for returning the result. The plus point of using the monad transformer is that this plumbing of the 2 monads is taken care of by the implementation of EitherT, so that we can simply define the following ..

type Eval2[A] = EitherT[String, Identity, A]

def eval2a: Env => Exp => Eval2[Value] = {env => exp =>
  //..
}

The error will be reported as String and the Value will be returned in the Right constructor of Either. Our return type is also explicit in what the function does. You can simply change the return type to Eval2 and keep the rest of the function same as eval1. It works perfectly like the earlier one. Since we have not yet coded explicitly for the error conditions, appropriate error messages will not appear, but the happy paths execute as earlier even with the changed return type. This is because Identity was a monad and so is the newly composed one consisting of Identity and EitherT.

We can run eval2a and the only difference in output will be that the result will be wrapped in a Right constructor ..

scala> val e1 = Plus(Lit(12), App(Abs("x", Var("x")), Plus(Lit(4), Lit(2))))
e1: Plus = Plus(Lit(12),App(Abs(x,Var(x)),Plus(Lit(4),Lit(2))))

scala> eval2a(collection.immutable.Map.empty[Name, Value])(e1)
res31: Eval2[Value] = scalaz.EitherTs$$anon$2@ad2f60

scala> res31.runT.value
res33: Either[String,Value] = Right(IntVal(18))

We can do a couple more iterations improving upon how we can handle errors using EitherT and issue appropriate error messages to the user. Here's the final version that has all error handling implemented. Note however that the core model remains the same - we have only added the Left handling for error conditions ..

def eval2: Env => Exp => Eval2[Value] = {env => exp =>
  exp match {
    case Lit(i) => intval(i).point[Eval2]

    case Var(n) => (env get n).map(v => rightT[String, Identity, Value](v))
                              .getOrElse(leftT[String, Identity, Value]("Unbound variable " + n))
    case Plus(e1, e2) => 
      val r = 
        for {
          i <- eval2(env)(e1)
          j <- eval2(env)(e2)
        } yield((i, j))

      r.runT.value match {
        case Right((IntVal(i_), IntVal(j_))) => rightT(IntVal(i_ + j_))
        case Left(s) => leftT("type error in Plus" + "/" + s)
        case _ => leftT("type error in Plus")
      }

    case Abs(n, e) => funval(env, n, e).point[Eval2]

    case App(e1, e2) => 
      val r =
        for {
          val1 <- eval2(env)(e1)
          val2 <- eval2(env)(e2)
        } yield((val1, val2))

      r.runT.value match {
        case Right((FunVal(e, n, exp), v)) => eval2(e + ((n, v)))(exp)
        case _ => leftT("type error in App")
      }
  }
}
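For comparison, here's a sketch of the same error handling written against the standard library's right-biased Either, with no scalaz and no Identity layer. This is a swapped-in technique and not the EitherT stack of the post; the language definitions are repeated only to keep the block self-contained, and the helper names plus and app are made up for the sketch ..

```scala
object EitherEvalSketch {
  type Name = String

  sealed trait Exp
  final case class Lit(i: Int) extends Exp
  final case class Var(n: Name) extends Exp
  final case class Plus(e1: Exp, e2: Exp) extends Exp
  final case class Abs(n: Name, e: Exp) extends Exp
  final case class App(e1: Exp, e2: Exp) extends Exp

  sealed trait Value
  final case class IntVal(i: Int) extends Value
  final case class FunVal(env: Env, n: Name, body: Exp) extends Value

  type Env = Map[Name, Value]
  type Eval[A] = Either[String, A]

  // hypothetical helpers: each one owns a single kind of type error
  private def plus(v1: Value, v2: Value): Eval[Value] = (v1, v2) match {
    case (IntVal(i1), IntVal(i2)) => Right(IntVal(i1 + i2))
    case _                        => Left("type error in Plus")
  }

  private def app(f: Value, arg: Value): Eval[Value] = f match {
    case FunVal(closureEnv, n, body) => eval(closureEnv + (n -> arg))(body)
    case _                           => Left("type error in App")
  }

  def eval(env: Env)(exp: Exp): Eval[Value] = exp match {
    case Lit(i)       => Right(IntVal(i))
    case Var(n)       => env.get(n).toRight("Unbound variable " + n)
    case Plus(e1, e2) =>
      for {
        v1 <- eval(env)(e1)
        v2 <- eval(env)(e2)
        r  <- plus(v1, v2)
      } yield r
    case Abs(n, e)    => Right(FunVal(env, n, e))
    case App(e1, e2)  =>
      for {
        f   <- eval(env)(e1)
        arg <- eval(env)(e2)
        r   <- app(f, arg)
      } yield r
  }
}
```

The first Left short circuits the rest of the for comprehension, so an unbound variable deep inside an expression surfaces as Left("Unbound variable ..") at the top.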

How about some State ?

Let's add some mutable state in the function using the State monad. So now we need to stack up our pile of transformers with yet another effect. We would like to add some profiling capabilities that track invocation of every pattern in the evaluator. For simplicity we just count the number of invocations as an integer and report it along with the final output. We define the new monad by wrapping a StateT constructor around the innermost monad, Identity. So now our return type becomes ..

type Eval3[A] = EitherT[String, StateTIntIdentity, A]

We layer the StateT between EitherT and Identity - hence we need to form a composition between StateT and Identity that goes as the constructor to EitherT. This is defined as StateTIntIdentity, with the state being an Int. And we define this as a type lambda as follows ..

type StateTIntIdentity[α] = ({type λ[α] = StateT[Int, Identity, α]})#λ[α]

Intuitively our returned value in case of a successful evaluation will be a tuple2 (Either[String, Value], Int), as we will see shortly.
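One way to build that intuition is to unroll the stack by hand. With Identity erased, EitherT over StateT over Identity boils down to a function from the incoming state to a pair of result and outgoing state. The sketch below is a plain Scala illustration of that shape; the names Eval, point, fail, tick and bind are made up here and are not scalaz API ..

```scala
object StackShapeSketch {
  // the whole stack, unrolled: state in, (error or value, state) out
  type Eval[A] = Int => (Either[String, A], Int)

  def point[A](a: A): Eval[A] = s => (Right(a), s)
  def fail[A](msg: String): Eval[A] = s => (Left(msg), s)

  // bump the counter, produce nothing interesting
  val tick: Eval[Unit] = s => (Right(()), s + 1)

  // thread the state through; stop at the first Left but keep the count
  def bind[A, B](m: Eval[A])(f: A => Eval[B]): Eval[B] = s =>
    m(s) match {
      case (Right(a), s1) => f(a)(s1)
      case (Left(e), s1)  => (Left(e), s1)
    }

  // two ticks, then a value
  val ok: Eval[Int] = bind(tick)(_ => bind(tick)(_ => point(18)))

  // one tick, then a failure; the tick after the failure never runs
  val bad: Eval[Int] =
    bind(tick)(_ => bind(fail[Int]("Unbound variable y"))(n => bind(tick)(_ => point(n))))
}
```

Running ok from a seed of 0 gives a Right(18) paired with a count of 2, while bad gives a Left paired with a count of 1. Because the state sits inside the Either layer, the counter survives a failure.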

We write a couple of helper functions that manage the state by incrementing a counter and lifting the result into a StateT monad and finally lifting everything into the EitherT.

def stfn(e: Either[String, Value]) = (s: Int) => id[(Either[String, Value], Int)](e, s+1)

def eitherNStateT(e: Either[String, Value]) =
  eitherT[String, StateTIntIdentity, Value](stateT[Int, Identity, Either[String, Value]](stfn(e)))

And here's the eval3 function that does the evaluation along with profiling and error handling ..

def eval3: Env => Exp => Eval3[Value] = {env => exp => 
  exp match {
    case Lit(i) => eitherNStateT(Right(IntVal(i)))

    case Plus(e1, e2) =>
      def appplus(v1: Value, v2: Value) = (v1, v2) match {
        case ((IntVal(i1), IntVal(i2))) => eitherNStateT(Right(IntVal(i1 + i2))) 
        case _ => eitherNStateT(Left("type error in Plus"))
      }
      for {
        i <- eval3(env)(e1)
        j <- eval3(env)(e2)
        v <- appplus(i, j)
      } yield v

    case Var(n) => 
      val v = (env get n).map(Right(_))
                         .getOrElse(Left("Unbound variable " + n))
      eitherNStateT(v)

    case Abs(n, e) => eitherNStateT(Right(FunVal(env, n, e)))

    case App(e1, e2) => 
      def appfun(v1: Value, v2: Value) = v1 match {
        case FunVal(e, n, body) => eval3(e + ((n, v2)))(body)
        case _ => eitherNStateT(Left("type error in App"))
      }

      val s =
        for {
          val1 <- eval3(env)(e1)
          val2 <- eval3(env)(e2)
          v    <- appfun(val1, val2)
        } yield v

      val ust = s.runT.value.usingT((x: Int) => x + 1)
      eitherT[String, StateTIntIdentity, Value](ust)
  }
}

We run the above function through another helper runEval3 that also takes the seed value of the state ..

def runEval3: Env => Exp => Int => (Either[String, Value], Int) = { env => exp => seed => 
  eval3(env)(exp).runT.value.run(seed)
}

Here's the REPL session with runEval3 ..

scala> val e1 = Plus(Lit(12), App(Abs("x", Var("x")), Plus(Lit(4), Lit(2))))
e1: Plus = Plus(Lit(12),App(Abs(x,Var(x)),Plus(Lit(4),Lit(2))))
scala> runEval3(env)(e1)(0)
res25: (Either[String,Value], Int) = (Right(IntVal(18)),8)

// -- failure case --
scala> val e2 = Plus(Lit(12), App(Abs("x", Var("y")), Plus(Lit(4), Lit(2))))
e2: Plus = Plus(Lit(12),App(Abs(x,Var(y)),Plus(Lit(4),Lit(2))))

scala> runEval3(env)(e2)(0)
res27: (Either[String,Value], Int) = (Left(Unbound variable y),7)

In case you are interested the whole code base is there in my github repository. Feel free to check it out. I will be adding a couple more transformers for hiding the environment (ReaderT) and logging (WriterT) and also IO.

Monday, July 11, 2011

Datatype generic programming in Scala - Fixing on Cata

The paper Functional Programming with Overloading and Higher-Order Polymorphism by Mark P Jones discusses recursion and fixpoints in a section Recursion schemes: Functional programming with bananas and lenses with all examples in Haskell. The paper is an excellent read for anyone interested in an overall landscape of functional programming concepts. In case you haven’t read it, stop reading this post and have a look at it.

The section on recursion in the original paper discusses type level fixpoints and how it can be used to define generic abstractions like catamorphism and anamorphism. These are used extensively in datatype generic programming, where you can define generic combinators parameterized by the shape of the data. A very common example of a combinator parameterized by the shape or type constructor is fold, which works with many recursive structures like List, Tree etc.

I have been trying to use Scala to do the same examples that the paper models in the section on recursion. I was curious to find out if Scala's type system is expressive enough to implement higher order abstractions like the type level fixpoint combinator that the paper implements using Haskell. Oliveira and Gibbons also made similar studies on the suitability of Scala for datatype generic programming and have translated Gibbons' ORIGAMI for a subset of the GoF patterns in Scala.

In order to learn type level fixpoints, I started with value level fixpoints using untyped lambda calculus. Once you study the properties of a fixed point for functions, you can find out the mappings with type level fixpoints. The former takes functions and maps types, while the latter takes type constructors and maps kinds.

I start with an introduction and some proofs for value level fixpoints. Most of it is pretty basic stuff - still I wanted to say these things as it may prove useful to someone learning my way. In case you are familiar with this, feel free to skip to the next section.

Value level fixpoint

Lambda calculus helps us split a recursive function definition into 2 parts - a non-recursive computable definition that abstracts the recursive call away to an additional parameter and a fixpoint operator that encodes the recursion part. This way we take the recursion off the main function that we model. Here’s the usual factorial that goes through this process and lands up as a fixed point computation ..

FACT = λn.IF (= n 0) 1 (* n (FACT (- n 1))) // λ calculus
FACT = (λfac. (λn.IF (= n 0) 1 (* n (fac (- n 1))))) FACT // beta abstraction


Note here we have isolated the main factorial function as an abstraction that does not have any recursion. Call this abstraction H, and the above equation becomes ..

FACT = H FACT


and this encodes the recursion part. H is a function which when applied to FACT gives back a FACT. We call FACT a fixpoint of H.

What then is a fixpoint combinator ?

A fixpoint combinator is a function Y which takes another function as argument and generates its fixpoint. e.g. in the case above Y when applied on H will give us the fixpoint FACT.

Y H = FACT        // from definition of Y    #1
FACT = H FACT     // from above              #2
Y H = H FACT      // applying #2 in #1       #3
Y H = H (Y H)     // applying #1 in #3       #4


Note we started with FACT as the model of our factorial. It’s an interesting exercise to see that FACT 4 indeed results in 24 following the above formula.
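If you would rather let the machine do that exercise, here's a small Scala sketch of the same equations. The names fix, H and fact are chosen for this illustration. Since Scala is strict, fix is written in eta-expanded form so that unfolding H (fix H) happens only when an argument arrives ..

```scala
object FixSketch {
  // Y H = H (Y H), eta-expanded so that strict evaluation terminates
  def fix[A, B](h: (A => B) => (A => B)): A => B =
    a => h(fix(h))(a)

  // H: factorial with the recursive call abstracted away as fac
  val H: (Int => Int) => (Int => Int) =
    fac => n => if (n == 0) 1 else n * fac(n - 1)

  // FACT as the fixpoint of H
  val fact: Int => Int = fix(H)
}
```

FixSketch.fact(4) unfolds H four times before hitting the base case and evaluates to 24.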

So Y is the magic that helps us express any recursive function as a non-recursive computation. In fact Y can be expressed as a lambda expression without using recursion ..

Y = (λh. (λx. h(x x)) (λx. h(x x)))


Let’s see what Y H evaluates to ..

Y H
= (λh. (λx. h(x x)) (λx. h(x x))) H    // definition of Y
-> (λx. H(x x)) (λx. H(x x))                 // beta reduction
-> H ((λx. H(x x)) (λx. H(x x)))             // beta reduction
-> H (Y H)


So that’s cool .. we have successfully derived the Y combinator as the lambda expression.

In the above we did the derivation of the fixed point combinator for any recursive function, which is a value. The subject of today’s post is the fixed point combinator for types. We will see how the above maps to the same model when applied at the type level.

Y for types

In this post I will model the type level fixpoint combinator in Scala showing that Scala’s type system has the power to express all the data type generic abstractions that Mark does using Haskell.

Here’s what we saw as fixed point for values (functions) ..

Y H = H (Y H)


The fixed point combinator for types is usually called Mu .. In Haskell we will say ..

data Mu f = In (f (Mu f))

Note the correspondence .. while Y takes functions and maps across types, Mu takes type constructors and maps across kinds ..

GHCi> :t Y
Y :: (a -> a) -> a
GHCi> :k Mu
Mu :: (* -> *) -> *


Now in Scala we model Mu as a case class that takes a type constructor as parameter. Like the paper we only consider unary type constructors - it’s not that difficult to generalize it to higher arities ..

case class Mu[F[_]](out: F[Mu[F]])


Just as the value level fixpoint Y lets us isolate a recursive function into a non recursive computation, Mu lets us do the same thing on types. Consider the following data type declaration for modeling Natural Numbers ..

trait Nat
case object Zero extends Nat
case class Succ(n: Nat) extends Nat


.. a recursive type .. let’s break the recursion by introducing an additional type parameter ..

trait NatF[+S]
case class Succ[+S](s: S) extends NatF[S]
case object Zero extends NatF[Nothing]

type Nat = Mu[NatF]


.. and modeling the actual type Nat as a fixpoint of NatF.

Mu is actually a functor fixpoint, which means that it works on functors (more specifically covariant functors). In our case we need to define a functor for NatF. No problem .. scalaz is there .. and here’s the functor instance for NatF ..

implicit object functorNatF extends Functor[NatF] {
  def fmap[A, B](r: NatF[A], f: A => B) = r match {
    case Zero => Zero
    case Succ(s) => Succ(f(s))
  }
}


And we define a couple of convenience functions for the zero natural number and the successor function ..

def zero: Nat = Mu[NatF](Zero)
def succ: Nat => Nat = (s: Nat) => Mu[NatF](Succ(s))


What we did with Mu

So far we defined a datatype as a fixpoint of a functor. Instead of making the datatype definition recursive we abstracted the recursion within the fixpoint combinator. And we did all these as a general strategy that can be applied to many other recursive datatypes.

Let’s apply the same pattern to a List data type .. ok we use an IntList, a list of integers, following the paper.

trait IntListF[+S]
case object Nil extends IntListF[Nothing]
case class Cons[+S](x: Int, xs: S) extends IntListF[S]

// the integer list as a fixpoint of IntListF
type IntList = Mu[IntListF]

// convenience functions for the constructors
def nil = Mu[IntListF](Nil)
def cons = (x: Int) => (xs: IntList) => Mu[IntListF](Cons(x, xs))


.. and the functor instance ..

implicit object functorIntListF extends Functor[IntListF] {
  def fmap[A, B](r: IntListF[A], f: A => B) = r match {
    case Nil => Nil
    case Cons(n, x) => Cons(n, f(x))
  }
}


Note the similarity in structure for both the datatypes Nat and IntList and how the type constructors in both the cases IntListF and NatF determine the shape of the computation for the respective datatypes. This means that the datatype definition is parameterized by a type constructor that determines the shape of the data that will be modeled by the datatype.

And now for the cata

Now with the above abstraction of Mu in place, we can define a combinator that can be shown to be more generalized than a fold .. it’s catamorphism, which we define as ..

def cata[A, F[_]](f: F[A] => A)(t: Mu[F])(implicit fc: Functor[F]): A = {
  f(fc.fmap(t.out, cata[A, F](f)))
}


For recursive datatypes, the signature of fold varies with the datatype on which it's applied. cata is similar to a fold, just more generic: the above definition can model functions over any type constructor for which there's a matching functor instance.

In this blog post Tony Morris discusses a catamorphism on an Option data type in Scala. He defines a cata which is specific for that data type. The above definition is at a higher level of abstraction and is parameterized on the shape of the data that it takes. The following snippets use the same cata to define functions for Nat as well as IntList. Datatype generic abstractions FTW.

Have a look at the following functions on Nat, all defined in terms of the cata combinator ..

def fromNat = cata[Int, NatF] {
  case Zero => 0
  case Succ(n) => 1 + n
} _ 

scala> fromNat(succ(succ(zero)))
res14: Int = 2

def addNat(m: Nat, n: Nat) = cata[Nat, NatF] {
  case Zero => m
  case Succ(x) => succ(x)
} (n)

scala> fromNat(addNat(succ(zero), succ(zero)))
res15: Int = 2
scala> fromNat(addNat(succ(zero), succ(succ(zero))))
res16: Int = 3

def mulNat(m: Nat, n: Nat) = cata[Nat, NatF] {
  case Zero => zero
  case Succ(x) => addNat(m, x)
} (n)

scala> fromNat(mulNat(succ(succ(succ(zero))), succ(succ(zero))))
res1: Int = 6
scala> fromNat(mulNat(succ(succ(succ(zero))), zero))
res2: Int = 0

def expNat(m: Nat, n: Nat) = cata[Nat, NatF] {
  case Zero => succ(zero)
  case Succ(x) => mulNat(m, x)
} (n)

scala> fromNat(expNat(succ(succ(succ(zero))), zero))
res0: Int = 1
scala> fromNat(expNat(succ(succ(succ(zero))), succ(succ(zero))))
res1: Int = 9


.. and some more on IntList using the same cata ..

def sumList = cata[Int, IntListF] {
  case Nil => 0
  case Cons(x, n) => x + n
} _

scala> sumList(cons(1)(cons(2)(cons(3)(nil))))
res1: Int = 6

def len = cata[Nat, IntListF] {
  case Nil => zero
  case Cons(x, xs) => succ(xs)
} _

scala> fromNat(len(cons(1)(cons(2)(cons(3)(nil)))))
res1: Int = 3
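
The same recipe carries over to shapes not covered above. As a sketch, here's a binary tree of integers modeled as a fixpoint of a hypothetical TreeF, with sum and depth both defined through cata. To keep the block free of scalaz, it declares its own minimal Functor trait with the same fmap signature used in this post, and repeats Mu and cata ..

```scala
object CataSketch {
  // minimal stand-in for the scalaz Functor used in the post
  trait Functor[F[_]] {
    def fmap[A, B](r: F[A], f: A => B): F[B]
  }

  final case class Mu[F[_]](out: F[Mu[F]])

  def cata[A, F[_]](f: F[A] => A)(t: Mu[F])(implicit fc: Functor[F]): A =
    f(fc.fmap[Mu[F], A](t.out, m => cata[A, F](f)(m)))

  // the shape of a binary tree, with the recursion abstracted as S
  sealed trait TreeF[+S]
  case object Leaf extends TreeF[Nothing]
  final case class Node[+S](left: S, value: Int, right: S) extends TreeF[S]

  implicit val functorTreeF: Functor[TreeF] = new Functor[TreeF] {
    def fmap[A, B](r: TreeF[A], f: A => B): TreeF[B] = r match {
      case Leaf           => Leaf
      case Node(l, v, rt) => Node(f(l), v, f(rt))
    }
  }

  // the tree as a fixpoint of TreeF, plus convenience constructors
  type Tree = Mu[TreeF]
  val leaf: Tree = Mu[TreeF](Leaf)
  def node(l: Tree, v: Int, r: Tree): Tree = Mu[TreeF](Node(l, v, r))

  def sumTree(t: Tree): Int = cata[Int, TreeF] {
    case Leaf          => 0
    case Node(l, v, r) => l + v + r
  }(t)

  def depth(t: Tree): Int = cata[Int, TreeF] {
    case Leaf          => 0
    case Node(l, _, r) => 1 + math.max(l, r)
  }(t)
}
```

For the tree node(node(leaf, 1, leaf), 2, node(leaf, 3, leaf)), sumTree gives 6 and depth gives 2.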

Monday, June 13, 2011

Composing Heterogeneous DSLs in Scala

When we use DSLs to model business rules, we tend to use quite a few of them together. We may use a DSL for computing date/time, another one for manipulating money with currency, and a few others for implementing the actual rules of the domain. And not all of them will be developed by us - third party DSLs play an equally important role out here.

But when we use a bunch of heterogeneous DSLs, our application needs to be flexible enough to adapt to the independent evolution paths of each of them. We need to be able to embed entire DSLs within our application structure and yet not be coupled to the implementation of any of them.

In this blog post, I will demonstrate how we can organize heterogeneous DSLs hierarchically and achieve the above goals of keeping individual implementations decoupled from each other. This is also called representation independence and is yet another strategy of programming to an interface. I will use Scala as the implementation language and use the power of Scala's type system to compose these DSLs together.

The most interesting part of this implementation will be to ensure that the same representation of the composed DSL is used even when you extend your own DSL.

We are talking about accounts and how to compute the balance that a particular account holds. We have a method balanceOf that returns an abstraction named Balance. No idea about the implementation details of Balance though ..

trait Account {
  val bal: Balances
  import bal._

  def balanceOf: Balance
}


Account is part of our own DSL. Note the abstract val bal of type Balances - it's a DSL which we embed within our own abstraction that helps you work with Balance. You can make a Balance abstraction, manipulate balances, change currencies etc. In short it's a general purpose utility DSL that we can frequently plug in to our application structure. Here we stack it hierarchically within our own DSL.

Here's how Balances looks ..

trait Balances {
  type Balance

  def balance(amount: Int, currency: String): Balance
  def amount(b: Balance): Int
  def currency(b: Balance): String
  def convertTo(b: Balance, currency: String): Balance
}


Note Balance is abstracted within Balances. And we have a host of methods for manipulating a Balance.

Let's now have a look at a sample implementation of Balances, which also concretizes a Balance implementation ..

class BalancesImpl extends Balances {
  case class BalanceImpl(amount: Int, currency: String)
  type Balance = BalanceImpl

  def balance(amount: Int, currency: String): Balance = {
    BalanceImpl(amount, currency)
  }

  def amount(b: Balance) = b.amount
  def currency(b: Balance) = b.currency

  def convertTo(b: Balance, toCurrency: String): Balance = {
    BalanceImpl(b.amount * 2, toCurrency)
  }
}


Note Balances is a complete DSL totally decoupled from our own DSL that has the Account abstraction.
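
To see what this decoupling buys, here's a sketch where client code is written once against the abstract Balances and then run with two different representations of Balance. The Balances trait is trimmed (convertTo is left out for brevity), and TupleBalances, CaseClassBalances and describe are hypothetical names introduced only for this illustration ..

```scala
object RepresentationSketch {
  // trimmed version of the Balances DSL from the post
  trait Balances {
    type Balance
    def balance(amount: Int, currency: String): Balance
    def amount(b: Balance): Int
    def currency(b: Balance): String
  }

  // one representation: a case class
  object CaseClassBalances extends Balances {
    final case class BalanceImpl(amount: Int, currency: String)
    type Balance = BalanceImpl
    def balance(amount: Int, currency: String): Balance = BalanceImpl(amount, currency)
    def amount(b: Balance): Int = b.amount
    def currency(b: Balance): String = b.currency
  }

  // another representation: a plain tuple
  object TupleBalances extends Balances {
    type Balance = (Int, String)
    def balance(amount: Int, currency: String): Balance = (amount, currency)
    def amount(b: Balance): Int = b._1
    def currency(b: Balance): String = b._2
  }

  // client code: sees only the interface, never the representation
  def describe(dsl: Balances)(amt: Int, cur: String): String = {
    import dsl._
    val b = balance(amt, cur)
    amount(b).toString + " " + currency(b)
  }
}
```

describe(CaseClassBalances)(10000, "USD") and describe(TupleBalances)(10000, "USD") both yield "10000 USD", and describe has no way to tell which representation it was handed.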

Now on to some concrete Account implementations ..

trait BankAccount extends Account {
  // concrete implementation of Balances
  val bal = new BalancesImpl

  // object import syntax
  import bal._

  // dummy implementation but uses the Balances DSL
  override def balanceOf = balance(10000, "USD")
}
object BankAccount extends BankAccount


It's a bank account that uses a concrete implementation of the Balances DSL. Note the balanceOf method uses the balance() method of the Balances DSL accessible through the object import syntax.

Now the fun part. My BankAccount uses one specific implementation of Balances. I would like to add a few decorators to my account, which will have richer versions of the API implementation. How do I ensure that all decorators that I may define for BankAccount also get to use the same DSL implementation for Balances?

Here's how .. Not only do we compose decorator and decoratee abstractions hierarchically, we use Scala's singleton type to ensure that the same representation of the Balances DSL gets to flow from the decoratee to the decorator.

// decorator
trait InterestBearing extends Account {
  // decoratee
  val semantics: Account

  // singleton type pulls the same representation from up
  val bal: semantics.bal.type

  // object import ensures that any method of
  // Balances that we use below comes from the same singleton type
  import bal._

  def interest: Int = 100 // dummy implementation

  override def balanceOf = {
    val b = semantics.balanceOf
    balance(amount(b) + interest, currency(b))
  }
}


So we have done a hierarchical composition of two heterogeneous DSLs and ensured that a single representation of one DSL is used uniformly within the other even in the face of extensions and decorations. The process has been made easy by the power of Scala's static type system.

Now we can have a concrete instance of an interest bearing bank account as a single Scala module ..

object InterestBearingBankAccount extends InterestBearing {
  val semantics = BankAccount
  val bal: semantics.bal.type = semantics.bal
}
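
To try the whole thing end to end, here's a condensed, self-contained sketch of the hierarchy. It is trimmed relative to the code in this post (no convertTo, explicit bal. prefixes instead of imports, BankAccount as a plain object), and DoubleInterest is a hypothetical second decorator layer added to show that the same Balances instance keeps flowing upward ..

```scala
object DecoratorSketch {
  trait Balances {
    type Balance
    def balance(amount: Int, currency: String): Balance
    def amount(b: Balance): Int
    def currency(b: Balance): String
  }

  class BalancesImpl extends Balances {
    case class BalanceImpl(amount: Int, currency: String)
    type Balance = BalanceImpl
    def balance(amount: Int, currency: String): Balance = BalanceImpl(amount, currency)
    def amount(b: Balance): Int = b.amount
    def currency(b: Balance): String = b.currency
  }

  trait Account {
    val bal: Balances
    def balanceOf: bal.Balance
  }

  object BankAccount extends Account {
    val bal: BalancesImpl = new BalancesImpl
    def balanceOf: bal.Balance = bal.balance(10000, "USD")
  }

  // decorator: reuses the decoratee's Balances through a singleton type
  trait InterestBearing extends Account {
    val semantics: Account
    val bal: semantics.bal.type
    def interest: Int = 100
    def balanceOf: bal.Balance = {
      val b = semantics.balanceOf
      bal.balance(bal.amount(b) + interest, bal.currency(b))
    }
  }

  object InterestBearingBankAccount extends InterestBearing {
    val semantics: BankAccount.type = BankAccount
    val bal: semantics.bal.type = semantics.bal
  }

  // hypothetical second layer, decorating the decorator
  object DoubleInterest extends InterestBearing {
    val semantics: InterestBearingBankAccount.type = InterestBearingBankAccount
    val bal: semantics.bal.type = semantics.bal
  }
}
```

Reading the amount of InterestBearingBankAccount.balanceOf through its own bal gives 10100, and DoubleInterest gives 10200. Both decorators end up holding the very same BalancesImpl instance as BankAccount.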