Monday, February 18, 2008

Safe HashMap - The Scala Way

Erik extends java.util.HashMap to implement null safe gets and puts in Java. The following piece of code is the native Java snippet that he compacts using SafeHashMap. The code iterates through a List of objects and populates a HashMap of Lists.


HashMap<String, List<String>> h = new HashMap<String, List<String>>();

for(Thing v : list) {
    List<String> ar = h.get(v.foo);
    if (ar == null) {
        ar = new ArrayList<String>();
        h.put(v.foo, ar);
    }
    ar.add(v.bar);
}



and his version using SafeHashMap ..


SafeHashMap<String, List<String>> h = new SafeHashMap<String, List<String>>(new ArrayList<String>());

for(Thing v : list) {
    h.get(v.foo).add(v.bar);
}



Obviously his solution is verbose enough (the code for SafeHashMap has been left out for brevity) because of the lack of nullable types in Java. Scala handles this nicely through the Option data type. Here is how I would implement the same in Scala ..


val list = List(...)
val h = new HashMap[String, List[String]]()

list foreach(=> names.put(v.foo, v.bar :: h.get(v.foo).getOrElse(List[String]())))



Scala's Option data type helps model optional values. An Option can be of the form Some(x) where x is the actual value, or the None object, which represents a missing value. And the Option class also supports a method getOrElse, that allows users to return the result of evaluating a default expression in case the value is empty. The above snippet uses getOrElse to construct the List in case of missing values. Also, the default expression in getOrElse uses call-by-name and hence evaluated only if the value is empty.

Is this idiomatic enough ?

5 comments:

Blair Zajac said...

Have you looked at the Scala Java Compatibility Layer?

http://www.scala-lang.org/docu/files/api/scala/collection/jcl$content.html

The only thin missing is it doesn't contain wrappers for new classes in JDK 5, such as ConcurrentHashMap.

Unknown said...

Here are 2 alternative implementations suggested elsewhere in response to my post:

Rhoomba suggests in reddit:

You also could just override the "default" method on HashMap:

val h = new HashMap[String, List[String]]() {
  override def default(key : String) = List[String]()
}

and Erkki Lindpere suggests in comments to Eugene's post:

import scala.collection.mutable._

val list = List(a, b, c, ...)
val h = new HashMap[String, ListBuffer[String]]()

for (v <- list) {
  h.getOrElseUpdate(v.foo, new ListBuffer[String]()) += v.bar
}

(ListBuffer is the mutable version of List, which is more appropriate in this case)

Nice ! The best thing about a language like Scala that offers higher level of abstractions is the choice of implementations that the basic language provides.

Unknown said...

The last time, I've needed this in Scala, I just overrided the default method. In Java, Google Collections' Multimap is also useful.

Unknown said...

@rafael:
Sure .. overriding the default() method is what rhoomba demonstrated in reddit (see my earlier comment).

Regarding MultiMap in google collections, it is definitely an option in Java. But what I like about languages like Scala is the flexibility that it provides when it comes to extending existing generic components. You do not need to have a separate class for extending an existing class' functionality. I plan to have a separate blog on this.

Ben said...

With regards to overriding default(): This works as long as you don't want to build up an immutable Map. With an immutable map, you return the new Map after each addition of a key,value pair. This new Map will be a standard Map without the override. Thus null key lookups will work the first time but not subsequent times!