Thursday, May 10, 2007

XML - Not for Human Consumption

I have never been a big fan of using XML as a language for human interaction. I have always felt that XML is too noisy for human comprehension and never a pleasing sight. In an earlier post I pitched the idea of using Lisp s-expressions as executable XML instead of the plethora of angle brackets that pollute comprehension. XML is meant for machine interpretation; the hierarchical processing of XML documents gives you raw power when you are programming for the machine. But, after all, "Programs must be written for people to read, and only incidentally for machines to execute" (from SICP).

With the emergence of the modern stream of scripting languages, XML is definitely taking a backseat in areas like software builds and configuration management. People are no longer amused by the XML hell of Maven and have been desperately looking for alternatives. Buildr, a drop-in replacement for Maven, uses Ruby (inspired by Rake); SCons uses Python scripts; and Groovy has been used in the configuration space for the Jini-based ComputeCycles project (via Artima). In the Spring community, too, XML has started to look like just one more option for configuring your beans. We are spoilt for choice: Spring JavaConfig uses Java, Springy uses JRuby, and Groovy SpringBuilder uses Groovy.

Configuring and building systems are nontrivial activities that need a mix of declarative and procedural programming. Using XML as a brute-force approach to both has resulted in hierarchical structures too complex for human comprehension. A perfect example is Maven, which attempts to do too many things with XML. The result is an extremely complex XML hell, much to the anguish of developers and build managers. A big build system configured with Maven becomes a maintenance nightmare in no time.

People are getting increasingly used to the comforts of DSLs, and configuring a large system has never been a trivial task. Hence it is logical that we would like something that provides a cool, humane interface. And XML is definitely not that.

13 comments:

Anonymous said...

But XML is not about programs, it is about data. The alternative would be binary file formats and even worse things.

Danny Yoo said...

Programs are data!

I think the point is that XML as a language for describing build configurations is bad because it's much more verbose than the alternatives.

This has nothing to do with binary vs. text formats: it has everything to do with the suitability of different text-based languages in writing program data.

As another concrete instance of an XML-based language that doesn't seem ideal, see XACML. (http://sunxacml.sourceforge.net/guide.html)

The code to express the first policy seems trivially verbose, to the point that it seems to punish the user with stuff that has nothing to do with expressing security policies.

Anonymous said...

False choice - there are other ways to represent data besides XML and binary (including the s-exprs the author mentioned).

Winterstream said...

I cannot say anything about Maven, since I've never worked with it.

I don't think I've ever worked with a build system I liked. I really hate Make, though I've had to learn its black magic.

So, as much as I agree with you on XML issues, I must add that it seems hard to make a decent build system.

The one thing XML has going for it is ubiquity. In this way, it's quite similar to PHP (this reminds me of an article I read on Reddit earlier today). I hate both; but damn, it's just so useful that they're everywhere.

At least one has the option of encoding data using s-expressions and only at the last minute converting the data to XML. Things are a bit more hairy with PHP.
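As a rough illustration of that "convert at the last minute" idea, the following Python sketch parses a small s-expression into nested lists and then emits XML from it. The helper names `parse_sexpr` and `to_xml` and the sample data are assumptions made for this example. The parser handles only bare atoms and parentheses (no quoting, attributes, or error recovery).

```python
import re
from xml.sax.saxutils import escape

def parse_sexpr(text):
    """Parse one s-expression into nested lists of strings (atoms only, no quoting)."""
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing paren
            return items
        return token

    return read()

def to_xml(node):
    """Treat the first item of each list as the tag name and the rest as children."""
    if isinstance(node, str):
        return escape(node)
    tag, *children = node
    return f"<{tag}>" + "".join(to_xml(c) for c in children) + f"</{tag}>"

tree = parse_sexpr("(project (name demo) (depends (lib core) (lib web)))")
print(to_xml(tree))
```

The data stays in the compact form while people edit it, and the XML exists only at the boundary where some other tool insists on it.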

Unfortunately, our "pragmatic" cousins are wedded to XML. You know, the "serious" programmers who write "serious, enterprise" apps (yeah, those boring blokes who used COBOL back in the day and Java or C# now).

XML would be less painful if we got some pattern matching support in languages, akin to the pattern matching found in functional languages.

Imagine if one could specify a schema, which the compiler would type check in matching expressions.

I know, I know, it's ugly. But I'm not optimistic enough to believe that good sense will triumph. There are too many SPIN (Software Process Improvement Network) devotees out there to let beauty and elegance triumph.

We're going to be stuck with XML for ages to come, so let's add some syntactic sugar to make the bitter load bearable.

Unknown said...

I don't mind writing programs that read XML and write XML, provided that there's a reason. And usually I use a library that saves me the trouble of parsing it. So XML has really delivered in that regard and has been a success.

But writing XML as a programming language still drives me crazy. Developers who decide to use XML for configuration (without a tool to build it) or use it as a scripting language are insane. It's not a friendly language to write in.

The same goes for HTML. But that's too entrenched to change at this point, and there are enough libraries that facilitate it if one so desires.

Unknown said...

The thread has been picked up at reddit. Enjoy the discussion.

Unknown said...

[for wynand: ]
Have a look at Rake or Buildr. Both of them are based on DSLs and are much friendlier than Make or XML-based systems.

Anonymous said...

If the answer is LISP you are asking the wrong question.

Unknown said...

Uhh...

<person>
  <name>Mike</name>
  <sex>Male</sex>
  <age>22</age>
</person>

is a lot more readable (for a person) than

Mike,m,22

Unknown said...

[for mike: ]
How about:
(person
  (name Mike)
  (sex Male)
  (age 22))
for readability?
Have a look at my earlier post on this subject of readability.

Anonymous said...

S-expressions are the best way to express data to be used by code. For the simple fact that the language expressing the data would be the exact same one as handling the code. No need for a parser, and you'd get WAY more power than you could ever get with XML (my point is, since the data IS code, you can apply methods to it, or do whatever you want DIRECTLY, with no parsing).

But religion, insecurity, management and plain stupidity will make sure you keep doing XML for years to come.

Unknown said...

[for anonymous: ]
Ah! The joy of program-as-data! Can't agree more. After years of programming in Java and C++, I can't believe I am saying this!

Unknown said...

What about JSON? You can work with it directly in JavaScript or Python, and there are libraries out there for a wide variety of languages. It is simple enough to read and understand. I am of course talking about it in the "data is code" context, because with JSON you can directly manipulate the data using JavaScript "eval".

http://www.json.org/
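A small sketch of the Python side of this, using the standard `json` module; the sample record is made up for the example. It uses a parser instead of an `eval`-style approach, since evaluating untrusted text as code would run whatever that text contains.

```python
import json

# A made-up record, written as JSON text.
text = '{"person": {"name": "Mike", "sex": "Male", "age": 22}}'

# Parsing yields ordinary dicts, strings, and numbers.
data = json.loads(text)

# The parsed data can be manipulated directly, with no tree-walking API.
data["person"]["age"] += 1
print(data["person"]["name"], data["person"]["age"])

# Serialize it back to text; sort_keys makes the output order predictable.
print(json.dumps(data, sort_keys=True))
```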