Sunday, September 06, 2009

Side-effects with Kestrel in Scala

Consider the following piece of logic that we frequently come across in codebases ..

  1. val x = get an instance, either create it or find it

  2. manipulate x with post-creation activities (side-effects)

  3. use x


Step 2 consists only of side-effecting operations, either on the instance itself or for other purposes like logging, registering, or writing to a database. While working on the serialization framework sjson, I have been writing pieces of code that follow exactly the above pattern to create Scala objects out of JSON structures. Now if you look at the above 3 steps, step 2 looks like a part of the namespace that calls the function that sets up x. But logically step 2 needs to be completed before we can use x, which means that step 2 is also a necessary piece of logic that has to run before we hand over the constructed instance x to the calling context.

One option of course is to make Step 2 a part of the function in Step 1. But this is not always feasible, since Step 2 needs access to the context of the caller namespace.

Let's look at an idiom in Scala that expresses the above behavior more succinctly and leads us into implementing one of the most popularly used objects in combinatory logic.

Consider the following method in Scala that creates a new instance of a class with the arguments passed to it ..


def newInstance[T](args: Array[AnyRef])(implicit m: Manifest[T]): T = {
  val constructor = 
    m.erasure.getDeclaredConstructors.first
  constructor.newInstance(args: _*).asInstanceOf[T]
}



I can use it like ..

newInstance[Person](Array("ghosh", "debasish"))

for a class Person defined as ..

case class Person(lastName: String, firstName: String)

It's often the case that I would like to perform some operations on the new instance after its creation, purely for their side-effects. They may or may not mutate the new instance, but they will be done in the context of the new instance. I can very well do that like ..

val persons = new ListBuffer[Person]()
val p = newInstance[Person](Array("ghosh", "debasish"))
persons += p  // register to the list
persons.foreach(mail("New member has joined: " + p)) // new member mail to all
//.. other stuff


It works perfectly .. but we can make the code more expressive if we can somehow be explicit about the context of the block of code that needs to go with every new instance of Person being created. Maybe something like ..

newInstance[Person](Array("ghosh", "debasish")) { p =>
  persons += p
  persons.foreach(mail("New member has joined: " + p))
  //.. other stuff
}


This clearly indicates that the side-effecting steps of adding to the global list of persons and sending out a mail to every member are also part of the creation process of the new person. The effect is the same as the earlier example, only that it delineates the context more clearly. At the end of it all, though, it returns only the instance that it creates.

Consider another example of a good old Java bean ..

class Address {
  private String street;
  private String houseNumber;
  private String city;
  private String zip;

  //..
  //.. getters and setters
}


Working with a reflection-based library, it's not uncommon to see code that instantiates the bean using the default constructor and then allows clients to set the instance up with custom values .. something like ..

var p = Person(..)
val pAddress =
  newInstance[Address](null) { a =>
    a.setStreet("Market Street")
    a.setHouseNumber("B/102")
    a.setCity("San Francisco")
    a.setZip("98032")
    p.address = a
    p.mail("Your address has been changed to: " + a)
  }


Once again the block is only for side-effects, and it can contain lots of other custom code that depends on the caller's context. We can make it more concise and DSLish using the object import syntax of Scala ..

var p = Person(..)
val pAddress =
  newInstance[Address](null) { a =>
    import a._
    setStreet("Market Street")
    setHouseNumber("B/102")
    setCity("San Francisco")
    setZip("98032")
    p.address = a
    p.mail("Your address has been changed to: " + a)
  }
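For readers who want to try the import trick in isolation, here is a minimal runnable sketch. The cut-down `Address` class and the `configured` helper are hypothetical stand-ins introduced only for this illustration; they are not part of sjson or of the `newInstance` method discussed in this post.

```scala
object ImportSketch {
  // Hypothetical, cut-down stand-in for the Address bean (two fields only).
  class Address {
    private var street: String = ""
    private var city: String = ""
    def setStreet(s: String): Unit = { street = s }
    def setCity(c: String): Unit = { city = c }
    def getStreet: String = street
    def getCity: String = city
  }

  // Hypothetical helper: run a block on a value for its side-effects,
  // then return the value itself.
  def configured[T](v: T)(op: T => Unit): T = { op(v); v }

  val address: Address = configured(new Address) { a =>
    import a._ // members of a can now be called without the "a." prefix
    setStreet("Market Street")
    setCity("San Francisco")
  }
}
```

The import works because `a` is a stable identifier (a function parameter), so its members can be brought into scope for the rest of the block.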


This looks like an idiom that can be an effective part of your programming repertoire. Here is the version of newInstance that allows you to make the above happen ..


def newInstance[T](args: Array[AnyRef])(op: T => Unit)(implicit m: Manifest[T]): T = {
  val constructor = 
    m.erasure.getDeclaredConstructors.first
  val v = constructor.newInstance(args: _*).asInstanceOf[T]
  op(v)
  v
}
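The definition above relies on `Manifest.erasure` and `first`, which newer Scala versions no longer offer in that form. As a rough sketch only, assuming a Scala version that provides `scala.reflect.ClassTag`, the same idea could be written as follows. The wrapping object `NewInstanceSketch` is a hypothetical name used to keep the example self-contained.

```scala
import scala.collection.mutable.ListBuffer
import scala.reflect.ClassTag

object NewInstanceSketch {
  case class Person(lastName: String, firstName: String)

  // Same shape as newInstance above, with ClassTag standing in for Manifest
  // (an assumption for newer Scala versions).
  def newInstance[T](args: Array[AnyRef])(op: T => Unit)(implicit ct: ClassTag[T]): T = {
    // Like the original, this simply picks the first declared constructor.
    val constructor = ct.runtimeClass.getDeclaredConstructors.head
    val v = constructor.newInstance(args: _*).asInstanceOf[T]
    op(v) // run the caller's side-effects
    v     // but always hand back the new instance
  }

  val persons = new ListBuffer[Person]()

  val p: Person = newInstance[Person](Array("ghosh", "debasish")) { member =>
    persons += member // register to the list
  }
}
```

After this runs, `p` is the newly created `Person` and `persons` contains exactly that one instance. The result of `persons += member` is discarded, since the block is typed as `T => Unit`.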



Looking carefully at newInstance I realized that it is actually the Kestrel combinator that Raymond Smullyan explains so eloquently in his amazing book To Mock a Mockingbird. A bird K is called a Kestrel if for any birds x and y, (Kx)y = x. And that's exactly what's happening with newInstance. It does everything you pass onto the block, but ultimately returns the new instance that it creates. A nice way to plug in some side-effects. Reg has blogged about Kestrels in Ruby - the tap method in Ruby 1.9 and returning in Rails. These small code pieces may not be as ceremonious as the popularly used design patterns, but just as effective and provide deep insights into how our code needs to be structured for expressivity. Next time you discover any such snippet that you find useful for sharing, feel free to write a few lines of blog about it .. the community will love it ..
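To make the bird law concrete, here is a small sketch of K itself in curried Scala. The names `KestrelSketch`, `log` and `answer` are made up for this illustration. The only difference in `newInstance` is that its second argument is a function that gets handed x, rather than an arbitrary value.

```scala
import scala.collection.mutable.ListBuffer

object KestrelSketch {
  // Smullyan's Kestrel in curried form: (K x) y = x.
  def K[A, B](x: A)(y: B): A = x

  val log = new ListBuffer[String]()

  // Scala evaluates arguments eagerly, so the second argument still runs
  // (here it appends to the log) even though its value is thrown away.
  val answer: Int = K(42)(log += "computed the answer")
}
```

The value of `answer` is 42, and the log holds the single message, which is the "do the side-effect, return the first thing" behavior described above.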

8 comments:

Anonymous said...

why would I use
val x = newInstance[javax.swing.JTextField](Array(int2Integer(20))) { tf =>
  tf.setBackground(java.awt.Color.WHITE)
}

instead of
val y = new javax.swing.JTextField(20) {
  setBackground(java.awt.Color.WHITE)
}

I still get the side-effect within the construction scope, which I think was the original intent.

Unknown said...

As I mentioned in the post, this idiom is most effective when you are writing reflection-based libraries where you instantiate classes based on reflection. Methods like newInstance come in handy in such cases. And the kestrel can be used to apply side-effects in the proper context. And this idiom is not specific to instantiation only. You can also write Kestrel combinators for any other generic operation for which you would like to have some side-effects before handing it over to the user.

Seth Tisue said...

"You can also write Kestrel combinators for any other generic operation", or you could just write it once and be done:

def kestrel[T](x: T)(f: T => Unit) = { f(x); x }

a handy debugging tool using this:

def printing[T](x: T) = kestrel(x)(println)

so then e.g.:

scala> printing(3 + 3) + printing(2 + 2)
6
4
res6: Int = 10

Unknown said...

Seth -

It may not be as simple as you have defined with kestrel taking a single argument besides the closure. Have a look at typical Rails code that uses Kestrel like the following:

bank = returning Bank.find(...) do |b|
log "bank #{b} found"
end.bank

Here find can take various arguments. Hence I mentioned that how generic it can be depends on the application or the domain.

Seth Tisue said...

That example works just fine with my version. You can pass any expression as the first argument:

kestrel(Bank.find(...)) { b =>
...
}

Unknown said...

oops .. sure it does. I missed it .. actually your implementation works fine for all kestrels that pass the newly instantiated / found object into the side-effecting block. I was going through Reg's blog and discovered this interesting variation in Ruby ..

Contact = Struct.new(:first, :last, :email) do
  def to_hash
    Hash[*members.zip(values).flatten]
  end
end

Struct takes an initializer block, but it doesn't pass the new class to the block as a parameter, it evaluates the block in the context of the new class.

These are some of the non-generic variations that we can have in implementing Kestrel. But your solution looks good for the majority of use cases. Thanks.

Prof. Beatriz Rodrigues said...

Very useful !!!
But coming from Java I am still struggling with "Any"s, like in this case:

case class Person(age: Int, name: String) {
  println("in person")
}

def newInstance[T](args: scala.Array[AnyRef])
    (implicit m: scala.reflect.Manifest[T]): T = {
  val constructor = m.erasure.getDeclaredConstructors.first
  constructor.newInstance(args: _*).asInstanceOf[T]
}

//now:
newInstance[Person](
  scala.Array(new java.lang.Integer(44), "Martin Odersky"))

//why not:
newInstance[Person](
  scala.Array(44, "Martin Odersky"))


I expected automatic / implicit casting to RichInt, but this is not what the 2.7.6 compiler requires ?!

jherber said...

see also:
http://weblog.jamisbuck.org/2006/10/27/mining-activesupport-object-returning