Tuesday, October 02, 2007

Refactoring - Only for Boilerplates ?

There are some bloggers who make you think, even if you do subscribe to an orthogonal view of the world. Needless to say, Steve Yegge is one of them. Quite some time back he introduced us to Fowler's Refactoring bible through this extremely thought provoking essay. Even if you are a diehard Java fan, his post forces you to think about the Caterpillar Butterfly conundrum. Seven months later, another great blogger, Raganwald, builds upon Yegge's post and discusses the various agonies that mutable local variables bring into your way of Extract Method refactoring. Raganwald's post led me thinking for the second time when I first read it. Last weekend, I came back to both of the links accidentally, through my favorite search engine (as if there are others also!) and re-read both of them. This post is an involuntary rant of that weekend reading.

Yegge says ..
Automated code-refactoring tools work on caterpillar-like code. You have some big set of entities — objects, methods, names, anything patterned. All nearly identical. You have to change them all in a coordinated way, like a caterpillar’s crawl, moving all the legs or lines this way or that.

Raganwald ends his post with the scriptum ..
This is exactly why languages with more powerful abstractions are more important than adding push-button variable renaming to less powerful languages. Where's the button that refactors a method with lots of mutable variables into one with no mutable variables? No button? Okay, until you invent one, don't you want a language that makes it easy to write methods and functions without mutable local variables?

Jokes apart, sentiments on the wayside, both of them target Java as the language of the Blub programmers, a language with less powerful abstractions, a language that generates caterpillar like code for push-button refactoring. They make you think deeply the possible reasons why you don't come across the term refactoring as often in other *powerful* languages like Lisp and Ruby. While you as a Java programmer consider yourself agile, when the menu item Refactor drops down into your favorite IDE and renames a public method correctly and with complete elan.

Is Refactoring only for Java ?

Martin Fowler defines refactoring as the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. We all want to do so on the software that we write - correct ? Sure, then why the heck, the Java guys boast on something that should, in principle, be the normal way of life irrespective of the language that you use ? Can it be the case that with other languages, programmers write the best optimized code the very first time and need not improve upon the design and code organization in future ? Too optimistic a claim even for Paul Graham.

OO Organization and Modularization

Java is an OO language and offers a wealth of various ways to organize your modules and subsystems. Being a class based language, Java offers various relationships to be established between your abstractions and organize them through polymorphic hierarchies. You can have inheritance of interface as well as implementation, containment and delegation and a slew of frameworks to work on various instantiation models and factories. You can have flexible packaging at the development level as well as deployment level. Java compiles into bytecode and runs in the most powerful VM on the planet - apart from pure Java you can drop into your codebase snippets of other scripting languages that can reside and run within the JVM. The moot point is that all these flexibilities provide you, the Java programmer, with a slew of options at every stage of design and development. As client's requirements change, as your codebase evolves, it is only natural that you optimize your code organization using the best possible modularization strategy. Does this establish refactoring as a necessity for well-designed software and not a mere tool ? More so, when you are using a programming language that offers a rich repertoire of code and module organization. Sure, you need to promote some state into an inner class increasing the level of abstraction, or refactor a slice of code snippet from an existing method into another reusable function.

And Automated Refactoring Tools ?

So, we now agree that refactoring is necessary and it leads us to the holy grail of well-designed software. Do we need automated refactoring tools ? May not be, if we are working on a small set of codebase which you can cache into your memory all at a time. The moment you start having page faults, you feel the necessity of automation. And typical Java enterprise systems are bound to cross the limits of you memory segment and lead to continuous thrashing. Obviously not something that you would want.

But what do you need to have automated refactoring capabilities into your IDE ? Type information, which unfortunately is missing from code written in most dynamic languages. Without type information, it is impossible to have automated full-proof refactoring. Cedric has a nice post which drums this topic to the fullest.

In short, with Java's rich platter of code organization policies, you have the flexibility of merciless refactoring and with Java's static type system you have the option of plugging in automated refactoring tools with your IDE - so nice.

What about Mutable Local Variables ?

It is a known fact that mutable variables, at any level, are an antipattern for functional programming. And a functional program makes it a mathematician's dream for all sorts of analyses. However, in Java, we are programming real world problems, which has enough of a state to model. And Java is a language which offers the power of assignment and mutability. Assignments are not always bad, and is a natural idiom for programming imperative languages. Try modeling every real world problem with a stack containing objects with nested life times and with a constant value during their entire life time. Dijkstra has the following observation while talking about the virtues of assignments and goto statements ..
.. the only way to store a newly formed result is by putting it on top of the stack; we have no way of expressing that an earlier value becomes now obsolete and the latter's life time will be prolonged, although void of interest. Summing up: it is elegant but inadequate. A second objection --which is probably a direct consequence of the first one-- is that such programs become after a certain, quickly attained degree of nesting, terribly hard to read.

It is all but natural that mutable local variables make automated refactoring of methods difficult. But, to me, the more important point is the locality of reference for those local variables. It all depends on the way the mutation is used and the closure of code that is affected by the mutation. No process can handle ill-designed code - disciplined usage of mutability on local variables can be handled quite well by the refactoring process. It all boils down to the problem of how well your code is organized and the criteria used in decomposing systems into modules (reference Parnas). After all, we are modeling state changes using mutable local variables - Raganwald suggests a coarse level state management using objects. This is what the State Design Pattern gives us. He mentions in one of his comments that Object oriented programming is, at its heart, all about state. if you write objects without state, you are basically using objects as namespaces. So true. But at the same time when you need to handle state changes through mutability, always apply them at the lowest level of abstraction, so that even if you need to synchronize for concurrent access, you do so by locking at the minimum level of granularity. So, why not mutable local variables, as long as the closure is well understood and justified by the domain for which it is being applied.

Mutable local variables, when inappropriately weaved into the sphagetti of a method, makes refactoring very difficult. At the same time when the programmer discovers this difficulty, she can try to redesign her method taking advantage of the numerous levels of abstraction that Java offers. From this point of view, Refactoring is also a great teacher towards the ultimate objective of well-designed software.

6 comments:

Carsten said...

In the essence it is like that: As your program contains some repetitive element that could be refactoring, you are not using the appropiate abstractions (or the "wrong" programming language)

I cannot relate to this, this would mean that the best possible implem,entaion would be a monster of 20k-20Mloc which contains no inner structure to give you a foothold for analysis.

I prefer having 5 times more code with a bit redundance that I can manage mostly with an IDE. This seems more reasonable to me: Have a look at the redundancies in natural langauges, they are there for a good reason - this reason has been verified by thje only reliable test available: The test of time

Dan Nugent said...

I think that a lot of programmers are ignoring an important point when people talk about reducing code repetition on large projects.

Part of the idea is that large projects are intrinsically *wrong*. That you should be looking at making a number of smaller projects that are composable, even if you never end up reusing one of those smaller projects elsewhere.

The *only* counterargument to better modularization and smaller projects that I know of is optimization. But common, when you're going towards optimization as one of your primary project goals you already have to acknowledge that you're entering a whole new world of pain

Anonymous said...

Java is an OO language and offers a wealth of various ways to organize your modules and subsystems.

Even ignoring that the first half of this statement pretty much precludes the second, Java is not particularly wealthy in this regard even in comparison to other recent OO languages.

But what do you need to have automated refactoring capabilities into your IDE ? Type information

Absolutely! But again, Java's typing is pretty weak for a statically-typed language of it's era. On the plus side, the tools for Java are unsurpassed.

Dijkstra: .. the only way to store a newly formed result is by putting it on top of the stack; we have no way of expressing that an earlier value becomes now obsolete and the latter's life time will be prolonged, although void of interest.

As I read it, this is purely a performance concern, right? Thankfully these days we have tail calls and garbage collection.

Dijkstra: A second objection --which is probably a direct consequence of the first one-- is that such programs become after a certain, quickly attained degree of nesting, terribly hard to read.

By which he justifies the introduction of goto (necessarily coupled with assignment - aka mutable variables), then questions whether it really helped. He advocates replacing goto with "more powerful notations", which don't hinge so much on the use of assignment. By now it is almost universally accepted that he was right about goto, but we're only slowly noticing that assignment is still hanging around like a bad smell.

After all, we are modeling state changes using mutable local variables

No! That's precisely it. Local means disappearing at function exit. There is no local change of state, so it is not necessary to mutate local variables.
There is only external change of state which is already tracked externally. Modelling external state-change with local variables is merely caching - a performance hack only.

Because they increase coupling, mutable local variables should only be used when necessary as a performance hack, and they should be marked such.

Perhaps I phrase some of this too strongly, but I find it so very frustrating that people can read all these well-considered, well-argued and well-written texts and come away without understanding their fundamental points.

Debasish said...

@Anonymous:
I am not advocating usage of mutable local variables. What I meant was that so long as the mutation is done with a purpose and the closure clearly defined within the codebase, the Extract Method refactoring can be done easily. So, the main issue in refactoring is NOT the mutable local variables, but the code organization itself. And very often we find situations which can be more readable using assignments.

Anonymous said...

情趣用品,情趣用品,情趣用品,情趣用品,情趣,情趣,情趣,情趣,按摩棒,震動按摩棒,微調按摩棒,情趣按摩棒,逼真按摩棒,G點,跳蛋,跳蛋,跳蛋,性感內衣,飛機杯,充氣娃娃,情趣娃娃,角色扮演,性感睡衣,SM,潤滑液,威而柔,香水,精油,芳香精油,自慰套,自慰,性感吊帶襪,吊帶襪,情趣用品加盟AIO交友愛情館,情人歡愉用品,美女視訊,情色交友,視訊交友,辣妹視訊,美女交友,嘟嘟成人網,成人網站,A片,A片下載,免費A片,免費A片下載愛情公寓,情色,舊情人,情色貼圖,情色文學,情色交友,色情聊天室,色情小說,一葉情貼圖片區,情色小說,色情,色情遊戲,情色視訊,情色電影,aio交友愛情館,色情a片,一夜情,辣妹視訊,視訊聊天室,免費視訊聊天,免費視訊,視訊,視訊美女,美女視訊,視訊交友,視訊聊天,免費視訊聊天室,情人視訊網,影音視訊聊天室,視訊交友90739,成人影片,成人交友,美女交友,微風成人,嘟嘟成人網,成人貼圖,成人電影,A片,豆豆聊天室,聊天室,UT聊天室,尋夢園聊天室,男同志聊天室,UT男同志聊天室,聊天室尋夢園,080聊天室,080苗栗人聊天室,6K聊天室,女同志聊天室,小高聊天室,上班族聊天室,080中部人聊天室,同志聊天室,聊天室交友,中部人聊天室,成人聊天室,一夜情聊天室,情色聊天室,寄情築園小遊戲情境坊歡愉用品,情境坊歡愉用品,情趣用品,成人網站,情人節禮物,情人節,AIO交友愛情館,情色,情色貼圖,情色文學,情色交友,色情聊天室,色情小說,七夕情人節,色情,情色電影,色情網站,辣妹視訊,視訊聊天室,情色視訊,免費視訊聊天,美女視訊,視訊美女,美女交友,美女,情色交友,成人交友,自拍,本土自拍,情人視訊網,視訊交友90739,生日禮物,情色論壇,正妹牆,免費A片下載,AV女優,成人影片,色情A片,成人論壇,情趣,免費成人影片,成人電影,成人影城,愛情公寓,成人影片,保險套,舊情人,微風成人,成人,成人遊戲,成人光碟,色情遊戲,跳蛋,按摩棒,一夜情,男同志聊天室,肛交,口交,性交,援交,免費視訊交友,視訊交友,一葉情貼圖片區,性愛,視訊,視訊聊天,A片,A片下載,免費A片,嘟嘟成人網,寄情築園小遊戲,女同志聊天室,免費視訊聊天室,一夜情聊天室,聊天室

Unknown said...

You post some great stuff Debasish. Thanks!