Tuesday, January 24, 2012

List Algebras and the fixpoint combinator Mu

In my last post on recursive types and fixed point combinator, we saw how the type equations of the form a = F(a), where F is the type constructor have solutions of the form Mu a . F where Mu is the fixed point combinator. Substituting the solution in the original equation, we get ..

Mu a . F = F {Mu a . F / a}

where the rhs indicates substitution of all free a's in F by Mu a . F.

Using this we also got the type equation for ListInt as ..

ListInt = Mu a . Unit + Int x a

In this post we view the same problem from a category theory point of view. This post assumes understanding of quite a bit of category theory concepts. If you are unfamiliar with any of them you can refer to some basic text on the subject.

We start with the definition of ListInt as in the earlier post ..

// nil takes no arguments and returns a List data type
nil : 1 -> ListInt

// cons takes 2 arguments and returns a List data type
cons : (Int x ListInt) -> ListInt


Combining the two functions above, we get a single function as ..

in = [nil, cons] : 1 + (Int x ListInt) -> ListInt

We can say that this forms an algebra of the functor F(X) = 1 + (Int x X). Let's represent this algebra by (Mu F, in) or (Mu F, [nil, cons]), where Mu F is ListInt in the above combined function.

As a next step we show that the algebra (Mu F, [nil, cons]) forms an initial algebra representing the data type of Lists over a given set of integers. Here we are dealing with lists of integers though the same result can be shown for lists of any type A.

In order to show (Mu F, [nil cons]) form an initial F-algebra we consider an arbitrary F-algebra (C, phi), where phi is an arrow out of the sum type given by :

C : 1 -> C
h : (Int x C) -> C


and the join given by [c, h] : 1 + (Int x C) -> C

By definition, if (Mu F, [nil, cons]) has to form an initial F-algebra, then for any arbitrary F-algebra (C, phi) in that category, we need to find a function f: Mu F -> C which is a homomorphism and it should be unique. So for the algebra [c, h] the following diagram must commute ..
which means we must have a unique solution to the following 2 equations ..

f o nil = c
f o cons = h o (id x f)


From the universal property of initial F-algebras it's easy to see that this system of equations has a unique solution which is fold(c, h). It's the catamorphism represented by ..

f: {[c, h]}: ListInt -> C

This proves that (Mu F, [nil, cons]) is an initial F-algebra over the endofunctor F(X) = 1 + (Int x X). And it can be shown that an initial algebra in: F (Mu F) -> Mu F is an isomorphism and the carrier of the initial algebra is (upto isomorphism) a fixed point of the functor. Well, that may sound a bit of a mouthful. But we can discuss this in more details in one of my subsequent posts. There's a well established lemma due to Lambek that proves this. I can't do it in this blog post, since it needs some more prerequisites to be established beforehand which would make this post a bit bloated. But it's really a fascinating proof and I promise to take this up in one of my upcoming posts. Also we will see many properties of initial algebras and how they can be combined to define many of the properties of recursive data types in a purely algebraic way.

As I promised in my last post, here we have seen the other side of Mu - we started with the list definition, showed that it forms an initial algebra over the endofunctor F(X) = 1 + (Int x X) and arrived at the same conclusion that Mu F is a fixed point. Or Mu is the fixed point combinator.

3 comments:

Anonymous said...

Debashish:

Sorry if this is a bit off-topic, and sorry for appearing (or, more honestly, being) a bit clueless:

I enjoy your blog -- even if I don't always understand it. I've been trying to get into FP a bit, to understand it more (thanks for the list of resources, BTW -- haven't started reading them yet, but hope to, soon) -- but the big question which keeps hitting over the head is: "Why?"

In this case (reading back to what seems to be the seminal post in this series -- on bananas, lenses, and "fixpoints") -- is this just an interesting puzzle-solving exercise, or is there some kind of real-world applicability? Would you envision using Mu (for example) in an application -- or would you consider such an approach basically unmaintainable by ordinary programmers?

(Forgive me: I expect, if I understood the subject more, I would know the answer to that -- but I find myself in a loop: having to put in lots of reading and effort to understand what people are saying, while not quite understanding *why* I should care -- which I expect I'm supposed to learn after I've deciphered the discussion.)

FP is, I admit, a bit of mystery to me. Yes, I get the whole point of immutability, lack of side-effects, etc -- but my eyes often glaze over when I see what seems to be a simple concept (say, EIP's word counter) implemented in a manner which, I'm pretty sure almost no-one I work with (or have ever worked with) would understand. I puzzle at the ongoing fascination with monads, and people doing (seemingly) hours of work rewriting an API to add a "print" statement to it -- or achieve some other mundane effect.

While I admire such mental prowess, I'm still scratching my head trying to understand the attraction -- and since you actually seem to do real-life work, and seem fairly practical -- I thought I'd just blurt out the question and get your take on it.

Regardless, thanks in advance. - Tim

Debasish Ghosh said...

Tim:

First I apologize if the post didn't make complete sense to you. At the same time I must admit that the content of the post may not make sense (a) at the first reading to the uninitiated and (b) if you don't care about the basic theory behind what you are implementing. It happened to me as well, when I first started digging deep into the subject.

For me it all started when I took up TAPL (Types and Programming Languages) by Benjamin Pierce. If you are a regular reader of my blog, you must be aware that one of the things that I get interest in is *types*. I like typed programming, implementation of type systems particularly with respect to functional programming languages. And there's hardly a better introduction to that than TAPL.

Assuming that you have interest in knowing some amount of type theory, one related subject that comes up together with it is category theory. You really can't learn type theory w/o a knowledge of category theory.

But let's see why we need to understand fixpoints. All of us have written recursive functions. But how do you write an anonymous recursive function ? You need to know about fixpoints for this. Fixpoints can be defined for untyped lambda calculus as well as for typed lambda calculus. Fixpoints can be defined at the value level (Y) or at the type level (Mu). Mu is the fixpoint (or more precisely, the least fixpoint) for types and we need it to understand the concept of recursive types (e.g. List, Tree etc.). But why do we care about Mu ? We can use java.util.List day in and day out without even caring about what Mu gives. Sure we can. But again, we frequently hear that a recursive data type like a Tree is also known as *algebraic data type*. Why ? What's the *algebra* behind this data type ? If you seek answers to these questions, then you will be led to the world of categories, algebras and morphisms.

As I showed through the last 2 posts on Mu, we can approach the subject from 2 sides - (a) type theory where we define it as a fixpoint and (b) from category theory where we show the algebra that defines Mu. The beauty of the 2 points of view is the unification through the definition of isomorphism between the two.

I hope I have been able to shed some light towards the usefulness of these concepts. But again feel free to ignore at your discretion. You may not need all these at your day job. I also don't need them for the work which I do from 9 to 5.

Anonymous said...

Debashish: Thought I'd done so earlier, but just wanted to say "thanks" for the answer and reading recommendation. Best to you!