Sunday, December 23, 2007

The Yegge Effect

So Steve Yegge has come up with another big post, and, as usual, the blogosphere is up on their heels. For mere mortal bloggers, it is like planning a movie release for not-so-big producers - your blog post is likely to be swamped in the Yegge Effect if the timing of your post happens to clash with that of Mr. Yegge. This post is a reaction to the Yegge Effect. Irrespective of whether you agree to all of his rants or not, a Yegge post has an impact, which you cannot ignore. You may call it the Yegge Impact as well.

Anyway, on a more serious note, it was a great post from Steve Yegge. It has brought to the forefront a number of software engineering issues - both technical and cultural. My greatest take from the post has been the preaching to select the engineering tools (read the language of development) based only on the job at hand. In fact, in one of the responses to the comments, Steve mentions that his selection of a dynamically typed language for rewriting Wyvern has been driven by the fact that the system he is trying to model is an extremely dynamic beast and in the earlier Java based system he needed all sorts of twisting and subversion of the type system to achieve the level of dynamism that the software demanded. And he chose Rhino over Groovy and Scala simply because the latter have not yet been mature enough and he was looking for something that has been enough time tested.

Yet Another Java Bashing ?

Expectedly enough, many of the bloggers have interpreted Yegge's post as another of those Java bashings. Admittedly there has been some unfair criticism of Java and design patterns in his post. We cannot ignore the fact that Java is verbose, and lacks the expressiveness of many of today's programming languages. And design patterns, very often, do not help in making your code succinct or concise, irrespective of the language in which you implement them. The biggest contribution of design patterns in statically typed implementations is the uniformity of vocabulary that they espouse - the moment you talk about factories, people understand creational semantics, the moment you talk about strategies, developers understand granular variations of algorithims within a larger context. Many of today's languages, especially the dynamically typed functional ones, have most of the 23 GOF patterns built-in the language themselves. To them, evolution of these patterns may sound like ways to get around the deficiencies of languages like Java and C++.

I feel Java has been able to successfully position itself as a much better C++. Accidental complexities of memmory management and pointer manipulations have been successfully abstracted away in Java - a definite step forward in expressiveness. And this is possibly the single most important reason that Java is ruling the enterprise even today. Java maintained the curly braces look and feel of C++ - yet another reason it could successfully launch itself within the cultural embodiment of C++ programmers. Raganwald is correct when he talks about Yegge's post as addressing more of cultural issues than technical ones. It is the Java culture that dominates the enterprise today. It does not require to be a strong elitist to program at a higher level of abstraction in Java - only you need to embrace the implementation of these features in the future versions of the language. Unfortunately many people are still against the empowerment of Java programmers with the new tools to write better programs. Steve Yegge mentions about copy-paste in Java causing code bloats. Many of the bloats are due to the fact that Java lacks the expressiveness of languages like Scala, but many of them are also due to an averagist mentality towards designing programs in Java. People have grown to develop a cultural apathy towards embracing new paradigms, incorporating functional features within the language, in the fear that it will make Java too complicated for the average Wall Street programmer. Until we come out of this vicious circle of apprehension, Java programs will continue to be great big machines that can move the dirt this way and that.

The question is .. Is Java too broken to support such empowerment ?

Tools of the Trade

Responding to a user comment, Steve says :
If I were writing a compiler, I'd use an H-M language (probably Scala). This game is a different beast; it's one of the most "living" systems in the world (notwithstanding its 1-year hibernation). You could do it in an H-M system, but it would be nearly as big as the Java system.

Select your programming language based on the task at hand. If you need well defined contracts that will be used by a large programmer base, then static typing is the answer. And Scala is one of the most advanced and expressive statically typed languages on the JVM today. In case Java decides not to implement any of the features that have been proposed by so many experts, then possibly Scala will enter into the same footsteps to displace Java that Java could successfully do to C++ a decade ago. Natural evolution !

The post also has a healthy bias towards dynamically typed systems, with all the code compression benefits that they offer through succinct abstractions. No complaints on that, though given today's constraints in executing large enterprise projects, I am still inclined to use the safety net that powerful static typing has to offer. One of the masters and researchers of programming languages, Matthias Felleisen makes the following observation in the comments to Yegge's post, while talking about evolutionary programming and gradual typing in ES4:
As a co-creator of gradual transformations from dynamically typed to statically typed languages (see DLS 2006), I am perfectly aware of this idea. The problem is that Ecmascript's implementation of the idea is broken, unsound. So whatever advantages you would get from a sound system, such as ML's or even Java's, you won't get from ES 4. (The most you get is C's notion of types.)

X Programmer (for all X)
But you should take anything a "Java programmer" tells you with a hefty grain of salt, because an "X programmer", for any value of X, is a weak player. You have to cross-train to be a decent athlete these days. Programmers need to be fluent in multiple languages with fundamentally different "character" before they can make truly informed design decisions.

A single language programmer always suffers from a myopic view of designing things. Unless you are exposed to other paradigms of programming, anonymous inner classes will always appear to be the most suitable idiom for the problem at hand. Many powerful programming techniques offered by the next door programming language may be more suited to solving your problem. Explore them, learn a new language and learn to think in that new language. This is a great message from Steve Yegge. Sometime back I was looking at an interview with Joe Armstrong, the inventor of Erlang. He believes that the combination of OCamL, Erlang and Haskell has the power to solve most of the problems of today's enterprise. OCamL, with all the static typing can be the C of tomorrow, the zen language for building the virtual machines, Erlang, with its dynamic typing, can support distributed applications with changing code on-the-fly, not really a cup of tea for constant type analyses, while Haskell can be great for building business applications. The only teaching that is required is to appease the user community of the unwarranted apprehensions of functional programming. Functional programming is based on the lambda calculus - it is the sheer intellectual weight of this very basic dictum that drives today's Java programmers away from the paradigm. Joe believes that functional programming experts should come forward more and write easy-to-read books to dispel this demon out of programmer's mind. Joe's book on Programming Erlang is a great piece - go grab a copy and do yourself a favor of learning a new functional language. You may not use FP in your next enterprise project, but you will surely be enriched with powerful idioms that higher order functions, pattern matching and hylomorphisms have to offer.

and then the central thesis ..

Refactoring makes the codebase larger that the IDEs of today are simply not capable of handling. Steve mentions that his codebase of 500K LOC could not be managed by Eclipse. I do not know for sure about the organization of the codebase of Wyvern, but I have seen Eclipse handle huge codebases. Only catch is that you need to manage them as multiple separate individual projects / artifacts with clean dependencies specified between them. And you need not have all of them open and active at the same time. Refactoring may increase the physical size of your codebase - refactoring away creational statements with pluggable factories through a Dependency Injection framework will cost some megabytes in terms of the ear size. But, I think the intellectual size of the codebase reduces - you now no longer need to worry about the creation processes and lifecycle management of your objects. Steve, however, feels that DI is also a bloat enhancer in a project. If I am using a statically typed language like Java, I think DI is a useful technique for implementing separation of concerns and modularization of your codebase. But I agree with Steve that platforms like Javascript, Ruby and Python do not need DI. They are too dynamic and offers enough flexibility to do away with the overhead of a DI framework. And regarding size bloat being reduced by dynamically typed languages, don't you think that it also increases the intellectual size of the codebase ?

So, that, in short was the Yegge Impact on me ..


Anonymous said...

Who the hell is Steve Yegge? If he is so important why is this the first time I've heard of him?

Anonymous said...

Don't know this Yeggie person, but I think all this Java bashing is disappointing.

Anonymous said...

I'm sick of people quoting Yegge, as if he has ever made any insightful comments. This last post is how he wrote a basic Java program, at 500 KLOC, which is *bigger* than Quake. Honestly, it is not the language which is verbose, but he is. His posts are full of ramblings that never get to a point. From the sound of this project, it seems like his code is written in the exact same way.

He needs to stop bashing languages and start learning how to program. The only one at fault is himself, but he would rather whine than put an ounce of effort to become compitent.

I'm saying this as someone who enjoys exporing languages. But writing a 1984-esque game in 500KLOC, rather than the 8KLOC it would have taken at the time, shows that he is simply incompitent.

Unknown said...

ha! ha! Steve Yegge has at least stirred up the blogosphere with his observations. Whether u agree to him or not is a different thing. But this is the Yegge effect as I have mentioned in the post.

Anyway, I am a huge fan of his rants and have been reading them since his Amazon days. His drunken rants are really very insightful and contains loads of technical content.

Regarding his migration of the entire 500K beast to Rhino is a decision that he himself has taken. Had I been in his situation, I would not have decided to go for a big bang migration of entire code base to something as evolving as Rhino. Wyvern definitely has lots of dynamic stuff which are more meaningfully modeled in a dynamically typed language. Possibly I would have identified those areas and targetted their migration to a dynamic language, keeping the base in Java itself. Verbose, as it is, Java, of course, has lots of advantages, static typing being one of them, particularly for such a large application. I am not sure if static typing of Java is one of the better implementations. Personally I think it is too verbose, lacks type inferencing etc. Scala, Haskell, OcamL provide better implementations of static typing. But once u have a large base with Java, I would have settled with it for the time being and plan for a careful migration in an evolutionary way. But hey! It is his code base - he is the best person to call the shots.