Tuesday, July 06, 2010

Refactoring into Scala Type Classes

A couple of weeks back I wrote about type class implementation in Scala using implicits. Type classes allow you to model orthogonal concerns of an abstraction without hardwiring them within the abstraction itself. This moves the bloat out of the core abstraction implementation into separate, independent class structures. Very recently I refactored Akka actor serialization and gained some real insights into the benefits of using type classes. This post is a field report of that refactoring.

Inheritance and traits looked good ..

.. but only initially. Jonas Boner and I had some cool discussions on serializable actors, and the design we came up with looked as follows ..

trait SerializableActor extends Actor 
trait StatelessSerializableActor extends SerializableActor

trait StatefulSerializerSerializableActor extends SerializableActor {
  val serializer: Serializer
  //..
}

trait StatefulWrappedSerializableActor extends SerializableActor {
  def toBinary: Array[Byte]
  def fromBinary(bytes: Array[Byte])
}

// .. and so on 

All these traits couple the concerns of serializability too tightly with the core Actor implementation. And with the various forms of serializable actors, we were clearly running out of class names. One lesson the GoF Patterns book taught us is that when you struggle to name your classes while using inheritance, you're definitely doing it wrong! Look for other ways to separate the concerns more meaningfully.

With Type Classes ..

We took the serialization stuff out of the core Actor abstraction into a separate type class.

/**
 * Type class definition for Actor Serialization
 */
trait FromBinary[T <: Actor] {
  def fromBinary(bytes: Array[Byte], act: T): T
}

trait ToBinary[T <: Actor] {
  def toBinary(t: T): Array[Byte]
}

// client needs to implement Format[T] for the respective actor
trait Format[T <: Actor] extends FromBinary[T] with ToBinary[T]

We define 2 type classes FromBinary[T <: Actor] and ToBinary[T <: Actor] that the client needs to implement in order to make actors serializable. And we package them together as yet another trait Format[T <: Actor] that combines both of them.
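To illustrate why packaging the two halves into Format is convenient, here is a minimal sketch that compiles without Akka. The `Actor` trait is a stand-in, and `Flag`, `FlagFormats` and `sizeOf` are hypothetical names introduced only for this example. It shows that a single `Format[T]` instance also satisfies code that asks for just `ToBinary[T]`.

```scala
// Stand-in for Akka's Actor so the sketch compiles on its own (an assumption).
trait Actor

trait FromBinary[T <: Actor] { def fromBinary(bytes: Array[Byte], act: T): T }
trait ToBinary[T <: Actor]   { def toBinary(t: T): Array[Byte] }
trait Format[T <: Actor] extends FromBinary[T] with ToBinary[T]

// Hypothetical actor with a single boolean of state.
class Flag extends Actor { var on = false }

object FlagFormats {
  // One instance covers both the read and the write half.
  implicit object FlagFormat extends Format[Flag] {
    def fromBinary(bytes: Array[Byte], act: Flag): Flag = {
      act.on = bytes.nonEmpty && bytes(0) == 1
      act
    }
    def toBinary(t: Flag): Array[Byte] =
      Array(if (t.on) 1.toByte else 0.toByte)
  }
}

import FlagFormats._

// Asks only for the write half; Format[Flag] satisfies it by subtyping.
def sizeOf[T <: Actor](t: T)(implicit w: ToBinary[T]): Int =
  w.toBinary(t).length

val n = sizeOf(new Flag)
```

A client that only ever writes actors out could depend on `ToBinary` alone, while clients that need a round trip ask for `Format`.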

Next we define a separate module that publishes APIs to serialize actors that use these type class implementations ..

/**
 * Module for actor serialization
 */
object ActorSerialization {

  def fromBinary[T <: Actor](bytes: Array[Byte])
    (implicit format: Format[T]): ActorRef = //..

  def toBinary[T <: Actor](a: ActorRef)
    (implicit format: Format[T]): Array[Byte] = //..

  //.. implementation
}
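The second parameter list in these signatures is an implicit one. As a rough sketch of what that means at a call site, the example below uses stand-ins rather than Akka's real classes: `Actor`, `Serialization`, `Greeter` and `GreeterFormats` are all assumptions made for illustration, and the method takes the actor directly instead of an `ActorRef` to keep the sketch small.

```scala
// Stand-ins so the sketch compiles without Akka; all names here are assumptions.
trait Actor
trait Format[T <: Actor] {
  def toBinary(t: T): Array[Byte]
  def fromBinary(bytes: Array[Byte], act: T): T
}

object Serialization {
  // The compiler supplies `format` from the implicit scope at the call site.
  def toBinary[T <: Actor](a: T)(implicit format: Format[T]): Array[Byte] =
    format.toBinary(a)
}

class Greeter(var greeting: String) extends Actor

object GreeterFormats {
  implicit object GreeterFormat extends Format[Greeter] {
    def toBinary(g: Greeter): Array[Byte] = g.greeting.getBytes("UTF-8")
    def fromBinary(bytes: Array[Byte], act: Greeter): Greeter = {
      act.greeting = new String(bytes, "UTF-8")
      act
    }
  }
}

import GreeterFormats._

val g = new Greeter("hello")
val viaImplicit = Serialization.toBinary(g)                // instance found in scope
val viaExplicit = Serialization.toBinary(g)(GreeterFormat) // same call, written out
```

Both calls go through the same instance. The lookup happens at compile time, and at runtime the instance is passed like any other argument.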

Note that these type classes are passed as implicit arguments that the Scala compiler will pick up from the surrounding lexical scope. Here's a sample test case which implements the above strategy ..

A sample actor with encapsulated state. Note that we no longer have any incidental complexity of my actor having to inherit from any specialized Actor class ..

class MyActor extends Actor {
  var count = 0

  def receive = {
    case "hello" =>
      count = count + 1
      self.reply("world " + count)
  }
}

The client then implements the type class for protocol buffer based serialization and packages it as a Scala module ..

object BinaryFormatMyActor {
  implicit object MyActorFormat extends Format[MyActor] {
    def fromBinary(bytes: Array[Byte], act: MyActor) = {
      val p = Serializer.Protobuf
                        .fromBinary(bytes, Some(classOf[ProtobufProtocol.Counter]))
                        .asInstanceOf[ProtobufProtocol.Counter]
      act.count = p.getCount
      act
    }
    def toBinary(ac: MyActor) =
      ProtobufProtocol.Counter.newBuilder.setCount(ac.count).build.toByteArray
  }
}

We have a test snippet that uses the above type class implementation ..

import ActorSerialization._
import BinaryFormatMyActor._

val actor1 = actorOf[MyActor].start
(actor1 !! "hello").getOrElse("_") should equal("world 1")
(actor1 !! "hello").getOrElse("_") should equal("world 2")

val bytes = toBinary(actor1)
val actor2 = fromBinary(bytes)
actor2.start
(actor2 !! "hello").getOrElse("_") should equal("world 3")

Note that the state is correctly serialized by toBinary and then subsequently de-serialized to get the updated value of the Actor state.
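For readers who want to try the round trip without Akka or protocol buffers, here is a small dependency-free analogue of the test. It is only a sketch under stated assumptions: `Actor` is a stand-in trait, `Counter`, `CounterFormats` and `Serialization` are hypothetical, the state is encoded as a 4-byte int with `java.nio.ByteBuffer`, and `fromBinary` takes a fresh instance explicitly instead of returning an `ActorRef`.

```scala
import java.nio.ByteBuffer

// Stand-ins so the sketch runs without Akka or protobuf; every name is an assumption.
trait Actor
trait Format[T <: Actor] {
  def toBinary(t: T): Array[Byte]
  def fromBinary(bytes: Array[Byte], act: T): T
}

class Counter extends Actor {
  var count = 0
  def hello(): String = { count += 1; "world " + count }
}

object CounterFormats {
  implicit object CounterFormat extends Format[Counter] {
    def toBinary(c: Counter): Array[Byte] =
      ByteBuffer.allocate(4).putInt(c.count).array()
    def fromBinary(bytes: Array[Byte], act: Counter): Counter = {
      act.count = ByteBuffer.wrap(bytes).getInt()
      act
    }
  }
}

object Serialization {
  def toBinary[T <: Actor](a: T)(implicit f: Format[T]): Array[Byte] =
    f.toBinary(a)
  // `fresh` is a new, empty instance that receives the decoded state.
  def fromBinary[T <: Actor](bytes: Array[Byte], fresh: T)(implicit f: Format[T]): T =
    f.fromBinary(bytes, fresh)
}

import CounterFormats._

val counter1 = new Counter
counter1.hello()
counter1.hello()

val bytes = Serialization.toBinary(counter1)
val counter2 = Serialization.fromBinary(bytes, new Counter)
val reply = counter2.hello() // continues from the restored count
```

The `Counter` class itself contains nothing about its byte layout; swapping the encoding means replacing only the `CounterFormat` instance.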

This refactoring has made the core actor implementation much cleaner by moving the concerns of serialization into a separate abstraction. The client code also becomes cleaner, in the sense that the client actor definition does not include details of how the actor state is serialized. Scala's implicit arguments and executable modules made this type class based implementation possible.



6 comments:

Martin K. said...

A nice approach that could be applied to different areas where serialization is needed.

I wonder why FromBinary, ToBinary, and Format require their type variable to be subtype of Actor.

I see the requirement for ActorSerialization, but not for the above.

Debasish said...

Absolutely +1 on your observation on the type constraints for FromBinary and ToBinary. In my case I did it only for actor serialization, hence the constraints. They can be moved to the Scala module ActorSerialization level as well. Will do the refactoring in Akka soon .. Thanks.

Alex said...

Using static dispatch (via type classes) seems rather limiting. Are there not use-cases in Akka where the specific type of actor is not known at compile-time?

Trond said...

Two questions: Are there any performance implications of using implicits, and does the currying get "optimized" by the compiler?

Debasish said...

Trond - everything is at the compiler level. No performance overhead at runtime.

Justin Wick said...

This appears to be the same way that type classes are used by the excellent sbinary library.