Ruminations of a Programmer: February 2013

Friday, February 15, 2013

A DSL with an Endo - monoids for free

When we design a domain model, one of the issues that we care about is abstraction of implementation from the user level API. Besides making the published contract simple, this also decouples the implementation and allows post facto optimization to be done without any impact on the user level API.

Consider a class like the following ..

// a sample task in a project
case class Task(name: String) 

// a project with a list of tasks & dependencies amongst the
// various tasks
case class Project(name: String, 
                   startDate: java.util.Date, 
                   endDate: Option[java.util.Date] = None, 
                   tasks: List[Task] = List(), 
                   deps: List[(Task, Task)] = List())

We can always use the algebraic data type definition above to add tasks and dependencies to a project. Besides being cumbersome as a user level API, it also is a way to program too close to the implementation. The user is coupled to the fact that we use a List to store tasks, making it difficult to use any alternate implementation in the future. We can offer a Builder like OO interface with fluent APIs, but that also adds to the verbosity of implementation, makes builders mutable and is generally more difficult to compose with other generic functional abstractions.

Ideally we should be having a DSL that lets users create projects and add tasks and dependencies to them.

In this post I will discuss a few functional abstractions that will stay behind from the user APIs, and yet provide the compositional power to wire up the DSL. This is a post inspired by this post which discusses a similar DSL design using Endo and Writers in Haskell.

Let's address the issues one by one. We need to accumulate tasks that belong to the project. So we need an abstraction that helps in this accumulation e.g. concatenation in a list, or in a set or in a Map .. etc. One abstraction that comes to mind is a Monoid that gives us an associative binary operation between two objects of a type that form a monoid.

trait Monoid[T] {
  def append(m1: T, m2: T): T
  def zero: T
}

A List is a monoid with concatenation as the append. But since we don't want to expose the concrete data structure to the client API, we can talk in terms of monoids.

The other data structure that we need is some form of an abstraction that will offer us the writing operation into the monoid. A Writer monad is an example of this. In fact the combination of a Writer and a Monoid is potent enough to have such a DSL in the making. Tony Morris used this combo to implement a logging functionality ..

for {
  a <- k withvaluelog ("starting with " + _)
  b <- (a + 7) withlog "adding 7"
  c <- (b * 3).nolog
  d <- c.toString.reverse.toInt withvaluelog ("switcheroo with " + _)
  e <- (d % 2 == 0) withlog "is even?"
} yield e

We could use this same technique here. But we have a problem - Project is not a monoid and we don't have a definition of zero for a Project that we can use to make it a Monoid. Is there something that would help us get a monoid from Project i.e. allow us to use Project in a monoid ?

Enter Endo .. an endomorphism which is simply a function that takes an argument of type T and returns the same type. In Scala, we can state this as ..

sealed trait Endo[A] {
  // The captured function
  def run: A => A
  //..
}

Scalaz defines Endo[A] and provides a lot of helper functions and syntactic sugars to use endomorphisms. Among its other properties, Endo[A] provides a natural monoid and allows us to use A in a Monoid. In other words, endomorphisms of A form a monoid under composition. In our case we can define an Endo[Project] as a function that takes a Project and returns a Project. We can then use it with a Writer (as above) and implement the accumulation of tasks within a Project.

Exercise: Implement Tony Morris' logger without side-effects using an Endo.

Here's how we would like to accumulate tasks in our DSL ..

for {
  _ <- task("Study Customer Requirements")
  _ <- task("Analyze Use Cases")
  a <- task("Develop code")
} yield a

Let's define a function that adds a Task to a Project ..

// add task to a project
val withTask = (t: Task, p: Project) => p.copy(tasks = t :: p.tasks)

and use this function to define the DSL API task which makes an Endo[Project] and passes it as a Monoid to the Writer monad. In the following snippet, (p: Project) => withTask(t, p) is a mapping from Project => Project, which gets converted to an Endo and then passed to the Writer monad for adding to the task list of the Project.

def task(n: String): Writer[Endo[Project], Task] = {
  val t = Task(n)
  for {
    _ <- tell(((p: Project) => withTask(t, p)).endo)
  } yield t
}

The DSL snippet above is a monad comprehension. Let's add some more syntax to the DSL by defining dependencies of a Project. That's also a mapping from one Project state to another and can be realized using a similar function like withTask ..

// add project dependency
val withDependency = (t: Task, on: Task, p: Project) => 
  p.copy(deps = (t, on) :: p.deps)

.. and define a function dependsOn to our DSL that allows the user to add the explicit dependencies amongst tasks. But this time instead of making it a standalone function we will make it a method of the class Task. This is only for getting some free syntactic sugar in the DSL. Here's the modified Task ADT ..

case class Task(name: String) {
  def dependsOn(on: Task): Writer[Endo[Project], Task] = {
    for {
      _ <- tell(((p: Project) => withDependency(this, on, p)).endo)
    } yield this
  }
}

Finally we define the last API of our DSL that glues together the building of the Project and the addition of tasks and dependencies without directly coupling the user to some of the underlying implementation artifacts.

def project(name: String, startDate: Date)(e: Writer[Endo[Project], Task]) = {
  val p = Project(name, startDate)
  e.run._1(p)
}

And we can finally create a Project along with tasks and dependencies using our DSL ..

project("xenos", now) {
  for {
    a <- task("study customer requirements")
    b <- task("analyze usecases")
    _ <- b dependsOn a
    c <- task("design & code")
    _ <- c dependsOn b
    d <- c dependsOn a
  } yield d
}

In case you are interested I have the whole working example in my github repo.

Friday, February 01, 2013

Modular Abstractions in Scala with Cakes and Path Dependent Types

I have been trying out various options of implementing the Cake pattern in Scala, considered to be one of the many ways of doing dependency injection without using any additional framework. There are other (more functional) ways of doing the same thing, one of which I blogged about before and also talked about in a NY Scala meetup. But I digress ..

Call it DI or not, the Cake pattern is one of the helpful techniques to implement modular abstractions in Scala. You weave your abstract components (aka traits), layering on the dependencies and commit to implementations only at the end of the world. I was trying to come up with an implementation that does not use self type annotations. It's not that I think self type annotations are kludgy or anything but I don't find them used elsewhere much besides the Cake pattern. And of course mutually recursive self annotations are a code smell that makes your system anti-modular.

In the following implementation I use path dependent types, which have become a regular feature in Scala 2.10. Incidentally it was there since long back under the blessings of an experimental feature, but has come out in public only in 2.10. The consequence is that instead of self type annotations or inheritance I will be configuring my dependencies using composition.

Let me start with some basic abstractions of a very simple domain model. The core component that I will build is a service that reports the portfolio of clients as a balance. The example has been simplified for illustration purposes - the actual real life model has a much more complex implementation.

A Portfolio is a collection of Balances. A Balance is a position of an Account in a specific Currency as on a particular Date. Expressing this in simple terms, we have the following traits ..

// currency
sealed trait Currency
case object USD extends Currency
case object EUR extends Currency
case object AUD extends Currency

//account
case class Account(no: String, name: String, openedOn: Date, status: String)

trait BalanceComponent {
  type Balance

  def balance(amount: Double, currency: Currency, asOf: Date): Balance
  def inBaseCurrency(b: Balance): Balance
}

The interesting point to note is that the actual type of Balance has been abstracted in BalanceComponent, since various services may choose to use various representations of a Balance. And this is one of the layers of the Cake that we will mix finally ..

Just a note for the uninitiated, a base currency is typically considered the domestic currency or accounting currency. For accounting purposes, a firm may use the base currency to represent all profits and losses. So we may have some service or component that would like to have the balances reported in base currency.

trait Portfolio {
  val bal: BalanceComponent
  import bal._

  def currentPortfolio(account: Account): List[Balance]
}

Portfolio uses the abstract BalanceComponent and does not commit to any specific implementation. And the Balance in the return type of the method currentPortfolio is actually a path dependent type, made to look nice through the object import syntax.

Now let's have some standalone implementations of the above components .. we are still not there yet to mix the cake ..

// report balance as a TUPLE3 - simple
trait SimpleBalanceComponent extends BalanceComponent {
  type Balance = (Double, Currency, Date)

  override def balance(amount: Double, currency: Currency, asOf: Date) = 
    (amount, currency, asOf)
  override def inBaseCurrency(b: Balance) = 
    ((b._1) * baseCurrencyFactor.get(b._2).get, baseCurrency, b._3)
}

// report balance as an ADT
trait CustomBalanceComponent extends BalanceComponent {
  type Balance = BalanceRep

  // balance representation
  case class BalanceRep(amount: Double, currency: Currency, asOf: Date)

  override def balance(amount: Double, currency: Currency, asOf: Date) = 
    BalanceRep(amount, currency, asOf)
  override def inBaseCurrency(b: Balance) = 
    BalanceRep((b.amount) * baseCurrencyFactor.get(b.currency).get, baseCurrency, b.asOf)
}

And a sample implementation of ClientPortfolio that adds logic without yet commiting to any concrete type for the BalanceComponent.

trait ClientPortfolio extends Portfolio {
  val bal: BalanceComponent
  import bal._

  override def currentPortfolio(account: Account) = {
    //.. actual impl will fetch from database
    List(
      balance(1000, EUR, Calendar.getInstance.getTime),
      balance(1500, AUD, Calendar.getInstance.getTime)
    )
  }
}

Similar to ClientPortfolio, we can have multiple implementations of Portfolio reporting that reports balances in various forms. So our cake has started taking shape. We have the Portfolio component and the BalanceComponent already weaved in without any implementation. Let's add yet another layer to the mix, maybe for fun - a decorator for the Portfolio.

We add Auditing as a component which can decorate *any* Portfolio component and report the balance of an account in base currency. Note that Auditing needs to abstract implementations of BalanceComponent as well as Portfolio since the idea is to decorate any Portfolio component using any of the underlying BalanceComponent implementations.

Many cake implementations use self type annotations (or inheritance) for this. I will be using composition and path dependent types.

trait Auditing extends Portfolio {
  val semantics: Portfolio
  val bal: semantics.bal.type
  import bal._

  override def currentPortfolio(account: Account) = {
    semantics.currentPortfolio(account) map inBaseCurrency
  }
}

Note how the Auditing component uses the same Balance implementation as the underlying decorated Portfolio component, enforced through path dependent types.

And we have reached the end of the world without yet committing to any implementation of our components .. But now let's do that and get a concrete service instantiated ..

object SimpleBalanceComponent extends SimpleBalanceComponent
object CustomBalanceComponent extends CustomBalanceComponent

object ClientPortfolioAuditService1 extends Auditing {
  val semantics = new ClientPortfolio { val bal = SimpleBalanceComponent }
  val bal: semantics.bal.type = semantics.bal
}

object ClientPortfolioAuditService2 extends Auditing {
  val semantics = new ClientPortfolio { val bal = CustomBalanceComponent }
  val bal: semantics.bal.type = semantics.bal
}

Try out in your Repl and see how the two services behave the same way abstracting away all implementations of components from the user ..

scala> ClientPortfolioAuditService1.currentPortfolio(Account("100", "dg", java.util.Calendar.getInstance.getTime, "a"))
res0: List[(Double, com.redis.cake.Currency, java.util.Date)] = List((1300.0,USD,Thu Jan 31 12:58:35 IST 2013), (1800.0,USD,Thu Jan 31 12:58:35 IST 2013))

scala> ClientPortfolioAuditService2.currentPortfolio(Account("100", "dg", java.util.Calendar.getInstance.getTime, "a"))
res1: List[com.redis.cake.ClientPortfolioAuditService2.bal.Balance] = List(BalanceRep(1300.0,USD,Thu Jan 31 12:58:46 IST 2013), BalanceRep(1800.0,USD,Thu Jan 31 12:58:46 IST 2013))

The technique discussed above is inspired from the paper Polymoprhic Embedding of DSLs. I have been using this technique for quite some time and I have discussed a somewhat similar implementation in my book DSLs In Action while discussing internal DSL design in Scala.

And in case you are interested in the full code, I have uploaded it on my Github.