Monday, October 27, 2008

Data 2.0 - Is your Application ready ?

Brian Aker talking about assumptions on Drizzle and future of database technologies ..

"Map/Reduce will kill every traditional data warehousing vendor in the market. Those who adapt to it as a design/deployment pattern will survive, the rest won't. Database systems that have no concept of being multiple node are pretty much dead. If there is no scale out story, then there is not future going forward."

Later on in the same post he goes on to mention the forces that he hopes would drive the performance of tomorrow's database technologies - map/reduce and asynchronous queues. Drizzle is a fork out of MySql, it is revolutionary in many respects compared to all other forks of the same product or similar products of the same genre. Drizzle does away with the erstwhile dodos of database technology viz. stored procedures, triggers etc. and is being planned exclusively for scaling out in the cloud. The development model is also ultra open source, aka organic open source and is being driven by the community as a whole.

Drizzle is clearly a step towards N > 1 and quite a forceful step too. Erlang was the silent roadmaker towards true N > 1 with the entire ecosystem of OTP, mnesia and the supervisor hierarchies .. Erlang offers the platform and the correct set of primitives for delivering fault tolerance in applications along with replicated mnesia storage. Erlang was not built for N = 1 .. it had N > 1 right into it and right from day 0 ..

Couchdb is another instance that may hit the sweet spot of getting the combination right - asynchronous map/reduce along with replicated, distributed, loosely coupled document storage. Implements REST, BASE, MVCC .. what else .. that leads to eventual consistency of the system.

All the buzz about future database technologies have been somehow related to the cloud, or at least being planned keeping the scale out factor in mind. Erlang started it all, and is now being actively driven in multiple dimensions by all similar complementary/competing technologies and platforms. No one talks about normalization or ACID or multi-phase commit these days. Call it Web 2.0 or Enterprise 2.0 or social networking, everything boils down to how enterprise IT can make data access easier, better, and more resilient to datacenter or network failures without compromising on the quality of service.

Middleware matters, data storage in the cloud matters, data processing in the cloud matters, and we are looking at lots of vendors fighting for a space there in. Applications of Enterprise 2.0 need to be smart and malleable enough to participate in this ecosystem. Make sure they are implemented as loosely coupled components that talk to middleware services asynchronously, can consume the atom feeds that other enterprise applications generate and do not depend on ACID properties or rely on synchronous 2-phase commit protocols from the data layer. Are we there yet ?

3 comments:

Stephan.Schmidt said...

CouchDB should have a better working Java API (Therefor Scala API).

http://stephan.reposita.org/archives/2008/10/24/scalaris/

CouchDB should better support XML and be data format agnostic.

(E4X and CouchDB results in some early hits).

Peace
-stephan

Anonymous said...

" Drizzle does away with the erstwhile dodos of database technology viz. stored procedures, triggers etc. "

IOW a MySQL developer wants to remove all those features that he doesn't understand and only reluctantly had added to MySQL.

This is retrenchment, but let him create a fork and try his best.

Anonymous said...

"IOW a MySQL developer wants to remove all those features that he doesn't understand and only reluctantly had added to MySQL."

Or... maybe he understands that stored procedures are about data locality in regards to execution and can be done through modern application frameworks.

Or... maybe he understands how the prelocking system works with Triggers, and how they bypass foreign key contraints.

Or... maybe he knows that view which are materialized in most cases are worthless.