Sunday, December 21, 2008

What is special about combinators in Joy

I am continuing my rendezvous with Joy, and this post is yet another rant about my perception of why combinators are real first class citizens in the paradigm of concatenative languages. Many of today's languages offer combinator libraries, but, with the possible exception of Haskell, none match the elegance that Joy offers.

Combinators help programming at a higher level of abstraction. In any functional language, using maps and folds definitely make your program more expressive than explicit recursion or iteration. Consider this snippet that uses map as a transformer over a list to convert every element of the list to upper case ..

upperCase xs = map toUpper xs

and here is the same using explicit recursion ..

upperCase (x:xs) = toUpper x : upperCase xs
upperCase []     = []

Apart from the former being more expressive, using the map combinator also helps you separate the strategy from the implementation of your program. map is a higher level abstraction and the implementation of map can be varied from recursive to iterative without any impact on the client code base. For example, without tail call optimization on the JVM, Scala implements as an iterative method as opposed to the recursive implementation of Erlang map.

So, always use combinators to program at a higher level of abstraction.

In Joy, programming with combinators is the most natural way of doing things. In case you are just wondering, why is it not so with the programming language of your choice, the reason is in the architecture of the language. The combinators which I implemented in Scala, in my last post, did the stuff, but do we see such usages in real life ? Even with languages like Ruby, which do not have the noise of type annotations, combinators like divide_and_conquer, are not in common usage. The only possible exception to this today is Haskell, which allows point free programming, is referentially transparent and allows algebraic program transformations much like using formal methods through the medium of programming. And combinators are the most natural idioms in concatenative languages like Joy.

Why is that ?

Combinators are most effective when you compose them from atomic programs and evolve towards the bigger whole. And the closer you can get towards algebraic transformation with your combinators, the more elegant the process of composition looks. Today's mainstream languages are based on application of functions on parameters, leading to complex evaluation rules. Formal parameters add to the noise of composition and make the process less elegant. In statically typed languages, you also have the type annotations that add to the complexity of composition. On the other hand, in Joy, every combinator has the same form (Stack -> Stack) and gets all parameters including quoted programs from the stack itself. Hence you can write elegant compositions like 1 [*] fold to define the product function over the elements of an aggregate. Today Haskell allows such point free style of programming (product in Haskell will be foldl (*) 1), but none of the other mainstream languages come even close. It will be interesting to compare Joy with Haskell with respect to rewriting capabilites and application for formal methods.

Perhaps, the most important feature in Joy that makes non-invasive composition of combinators work like a charm is the datatype of quoted programs. A quoted program is just another list that happens to be on top of the stack when a combinator expects it. And the list can very well be generated as a result of other combinator operations. It is this generalization that unifies all combinator operations. In other languages, combinators work on different abstractions, while in Joy, it is always the same thing - combinators get lists on top of stack (which are incidentally quoted programs), execute them and return a stack with the result. Hence map is a higher order function in other languages, while in Joy, map is still a unary first order function from stacks to stacks.

At the same time, quoted programs in Joy, allow the normal evaluation order semantics to be imposed during program execution, in an otherwise applicative model.

Here is an example ..

0  [dup * +]  fold

This is an example of using the fold combinator. The quoted program [dup * +] will be on the stack in passive form, and will only be active when we have the fold combinator being applied. The result is the sum of squares of the input aggregate.

Recursion Combinators

As mentioned at the beginning of the post, if explicit recursion makes programs hard to understand, how do we hide 'em ? Inside recursive combinators. But can we remove explicit recursion from individual combinators ? Sure .. we can, by introducing the venerable y-combinator. So, only the y-combinator will contain explicit recursion, and all other erstwhile recursive combinators will be implemented in terms of Y.

Let us visit the factorial example and find out for ourselves, how the y-combinator can be used to remove explicit recursion and what trade-off does it imply ..

Here is the explicitly recursive version of factorial in Joy ..

recursive-fac  ==  [0 =] [succ] [dup pred recursive-fac *] ifte

Just for the uninitiated, ifte is the if-then-else combinator in Joy that takes 3 program parameters - quite intuitively, the if-part, the then-part and the else part. Note how effectively, quotation has been used here to impose normal order of evaluation, aka call-by-name. Otherwise it would not have been possible to implement it at all.

and here is the version that uses the y-combinator ..

[ [pop 0 =] [pop succ] [[dup pred] dip i *] ifte ] y

where y is implemented as simply as ..

[dup cons] swap concat dup cons

and .. and .. which version do you think is more readable ? Of course the one that uses explicit recursion. Huh!

This is the problem with the y-combinator. It is too generic, all-purpose, and hence not enough descriptive. And when combinators are not self-descriptive, the readability goes for a toss. And this is yet another area where Joy shines ..

Joy has special purpose combinators for recursion that are independent of data types and enough descriptive depending on the structure of the recursion. Here is an example with the factorial computation ..

fac  == [null] [succ] [dup pred] [*] linrec

linrec is a recursion combinator that takes four parameters in addition to the data that it needs.

  • The first one is the if-part, where we specify a condition.

  • If the condition evaluates to true, then the second part (the then-part) gets executed and the combinator terminates.

  • Otherwise we look at the next 2 parameters - called the rec1-part and rec-2 part.

  • The rec1-part is executed and then the 4 program parameters are once again bundled together and the quotation form is immediately executed.

  • This is followed by the execution of the rec2-part.

The entire sequence models what we call a linear structural recursion.

For other variants of recursive problems, Joy offers a host of similar other combinators ..

  • binrec is for binary recursion, that makes problems like fibonacci computation or quicksort a trivial one liner

  • tailrec is for tail recursion, quite similar to linrec but without the rec2-part

  • primrec is for primitive recursion which offers suitable defaults for the if-part and the rec1-part

Then there are a host of other recursive combinators for more advanced recursions like mutual recursion, nested recursion, generic recursion with n-ary branching etc.


Daniel said...

I have programmed in Forth for quite a while, and I can understand the joy of Joy, pun intended. :-)

However, while it's quite easy to understand what the heck is going on while you are writing it, it's very tough getting back to it after a while (or maintaining someone else's code).

Also, while it's beautiful not having variables intruding the flow of code, it also mean you lose that tiny little documentation piece called variable name.

To make it worse, it introduces a type of error that's just not present on non-stack languages. You can actually pass the wrong data, if you for a second confuse what is being left on the stack.

And, just to top it all, it's a bitch to change if your requirements change.

Unknown said...

So true .. that's why I mentioned in an earlier post that Joy will give you a nice evening and a wonderful diversion from mainstream languages. Of course the fun of programming in a different paradigm comes as a bonus ..