Friday, October 03, 2008

Functional List with fast Random Access

I needed a List with fast random access capabilities. Standard implementations of a List takes O(i) to access the ith element. I am using Scala and arrays do not really cut as a well-behaved functional data structure. Besides I needed dynamic resizing, persistence and good worst case complexity. Anyway I was trying to justify implementing Okasaki's Purely Functional Random Access Lists ..

The data structure provides O(log n) random access along with O(1) list primitives (head, tail and cons). The data structure uses a collection of complete binary trees as the representation of the list. Access is through preorder traversal, allowing O(1) head. And the number of trees is determined by a skew binary decomposition of integer n, for n being the size of the list. A skew binary decomposition is a collection of skew binary terms, where a skew binary term is of the form (2^k - 1). e.g. the skew binary decomposition of 43 is (1 + 1 + 3 + 7 + 31). If the list has 43 elements, we will represent it using a collection of 5 binary trees of sizes 1, 1, 3, 7 and 31.

For a random lookup, we first need to locate the tree, a log n operation and then search within the tree, another log n operation. Okasaki's paper has all the details of analysis. It is straightforward, yet elegant ..

Two implementations are given ..

the first one uses a list of tuples (List[(Int, Tree[T])]) to implement RAList[T]

object RAList {
  def empty[T]: RAList[T] = RAList.Nil

  private [immutable] case object Nil extends RAList[Nothing] {
    override def equals(that : Any) = this eq that.asInstanceOf[AnyRef]

  private [immutable] case class Root[+T](trees: List[(Int, Tree[T])])
    extends RAList[T] {


the second one uses a recursive data structure to implement the same. This is a better implementation than the first one and proved to be faster as well ..

object RandomAccessList {
  def empty[T]: RandomAccessList[T] = RandomAccessList.Nil

  private [immutable] case object Nil extends RandomAccessList[Nothing] {
    override def equals(that : Any) = this eq that.asInstanceOf[AnyRef]

  private [immutable] case class Root[+T](size: Int, tree: Tree[T], rest: RandomAccessList[T])
    extends RandomAccessList[T] {

I have setup a Google project, which I named pfds (Purely Functional Data Structures), containing both the implementations. The implementations come with some unit tests using specs. Everything is quite rudimentary at this stage, but I plan to enrich the project with more functional data structures .. Meanwhile you can browse through the source code, critique the implementation with suggestions to make them more Scalaesque ..


Justin Shacklette said...

You'll definitely want to check out the
Persistent Vector impl by Daniel Spiewak.

And also the new collections in Scala 2.7.2 (especially IntMap) by David MacIver.

Debasish said...

Sure .. I have looked at both of them. IntMap in Scala stdlib is based on Okasaki's Integer Map implementation, a beautiful functional data structure. Daniel's PersistentVector is adapted from Rich Hickey's Clojure and is yet another functional data structure that offers persistence.
Recently with the upsurge of interest in functional programming we are looking at more and more functional data structures wrt the advantages that they offer as compared to imperative data structures, the most important ones being baked-in-concurrency and full persistence.