Ruminations of a Programmer: September 2007

Friday, September 28, 2007

Domain Modeling with JPA - The Gotchas - Part 2 - The Invaluable Value Objects

In the first post of this gotcha series, I had discussed some of the issues around making entities publicly immutable, by not exposing direct setters to the layers above. This approach has its own set of advantages and offers a big safety net to the domain model. The domain model can then be manipulated only through the methods published by the domain contracts. While still on the subject of immutability of domain models, I thought I would discuss about the other cousin of immutable entities that plays a very big role in making your domain driven design more supple.

Enter Value Objects.

While an object-oriented domain model focuses on the behavior of entities, the relational persistence model manages object identities. And a successful marriage of the two paradigms is the job of a good ORM framework. But not all entities need to maintain their identities - their behaviors depend only upon the values they carry. Eric Evans calls them Value Objects.

Value objects are an integral part of any object oriented model, while they are somewhat obscure in the relational persistence model. It is a real challenge to have a successful representation of value objects as reusable abstractions in the OO domain model, while transparently storing them in the relational model with minimum invasiveness on part of the programmer. Value objects increase the reusability of the domain model and JPA offers a flexibile programming model to make their persistence transparent to the developer. The big advantages with value objects are that you need not manage their identities or their lifetimes - both of them are the same as the entities which own them.

Modeling a Value Object with JPA

Consider a sample model snippet where an Employee has-an Address - both of them are designed as separate domain objects in the model. After a careful analysis of the domain, we find that addresses are never shared, i.e. each employee will have a unique address. Hence the relational model becomes the following monolithic table structure :

create table employee (
  //..employee specific columns
  //..
  //..address specific columns
)

In the relational model, we need not have any identity for an address - hence it can be seamlessly glued into the employee record. While in the OO model, we need to have a fine grained abstraction for Address, since the purpose of the OO model is to have the most faithful representation of how the domain behaves. The Address class will have its own behavior, e.g. the format in which an Address gets printed depends upon the country of residence, and it makes no sense to club these behaviors within the Employee domain entity. Hence we model the class Address as a separate POJO.

// immutable
class Address {
  private String houseNumber;
  private String street;
  private String city;
  private String zip;
  private String country;

  //.. getters
  //.. no setter
  //.. behaviors
}

and an Employee has-an Address ..

class Employee {
  private Address homeAddress;
  //.. other attributes
}

JPA makes it really easy to have a successful combination of the two models in the above relationship. Just add an @Embedded annotation to the Address property in Employee class. This will do all the magic to make all individual address attributes as separate columns in the Employee table. And of course we can use all sorts of annotations like @AttributeOverride to change column names between the class and the table.

@Entity
class Employee {
  @Embedded
  @AttributeOverrides( {
    @AttributeOverride(name   =  "street",
        column = @Column(name = "home_street")),
    @AttributeOverride(name   =  "city",
          column = @Column(name = "home_city")),
    @AttributeOverride(name  =  "zip",
          column = @Column(name = "home_zip"))})
  private Address homeAddress;
  //.. other attributes
}

Modeling with JPA allows independent evolution of the OO domain model and relational persistence model. Don't ever try to enforce the relational paradigm into your domain - you are likely to end up in the swamps of the ActiveRecord modeling.

Collection of Value Objects

In the above example, the entity Employee has a one-to-one association with Address - hence it was easy to embed the address attributes as columns within the Employee table. How do we handle a one-to-many association between an entity and a value object ? Let us have a look at this scenario ..

A Project is an entity which abstracts an active project in a company. And the company raises Bills periodically to its clients for all the projects that it executes. The Bill object is a value object. We just have to raise bills and keep a record of all bills raised till date. A Bill does not have an identity, it's only the bill date and amount that matters. But we need to associate all bills with the project for which it is raised. This clearly warrants a 1..n association in the relational model as well. And the lifecycle of all bills is coupled to the lifecycle of the owning project. Sharing of bills is not allowed and we do not need to manage identities of every bill.

Using Hibernate specific annotations, here's how we can manage a set of value objects owned by an entity.

@Entity
class Project {
  //.. attributes

  @CollectionOfElements
  @JoinTable(name="project_bill",
    joinColumns = @JoinColumn(name="project_pk")
  )
  @AttributeOverrides( {
    @AttributeOverride(name = "billNo",
        column = @Column(name = "project_bill_no")),
    @AttributeOverride(name = "billDate",
      column = @Column(name = "project_bill_date")),
    @AttributeOverride(name = "raisedOn",
        column = @Column(name = "raised_on")),
    @AttributeOverride(name = "amount",
      column = @Column(name = "project_bill_amount"))}
  )
  @CollectionId(
    columns = @Column(name = "project_bill_pk"),
    type = @Type(type = "long"),
    generator = "sequence"
  )
  private Set<Bill> bills = new HashSet<Bill>();

  //..
  //..
}

Bill is not an entity - it is a simple POJO, which can be reused along with other owning entities as well. And if we want an inverse association as well, we can maintain a reference to the owning project within the Bill class.

@Embeddable
public class Bill {
  @Parent
  private Project project;
  //..
  //..
}

The database contains a table project_bill, which keeps all bills associated with a project indexed by project_pk. In case we need a sequencing of all bills, we can have a sequence generated in the project_bill table itself through the @org.hibernate.annotations.CollectionId annotation.

Value objects are an immensely useful abstraction. Analyse and find out as many value objects as you can in your domain model. And use the power of JPA and your ORM implementation to map them into your persistent model. The more value objects you can dig out, less will be the effort in managing identities and controlling lifetimes for each of them.

Decoupled Value Object Instantiation Models

There are some situations where value objects tend to be numerous in number. Here is an example :

Every employee has-a designation. Designation is a value object in our domain model and in a typical organization we have a limited number of designations. We make a separate abstraction for designation, since a designation has other behaviors associated with it e.g. perks, salary bracket etc. Here we go ..

@Embeddable
class Designation {
  //.. attributes
  //.. behavior
  //.. immutable
}

and the Employee entity ..

@Entity
class Employee {
  //.. attributes
  private Designation designation;
  //.. other attributes
  //..
}

What about the relational model ? We can employ a nice little trick here ..

Clearly many employees share a designation - hence, theoretically speaking, Designation is an entity (and not a value object) in the relational model, having a 1..n association with the Employee table. But, as Eric Evans has suggested in his discussion on Tuning a Database with Value Objects, there may be situations when it is better to apply denormalization techniques for the sake of storing collocated data. Making Designation an entity and wiring a relationship with Employee through its identity will store the Designation table in a far away physical location, leading to extra page fetches and additional access time. As an alternative, if access time is more critical than physical storage, we can store copies of Designation information with the Employee table itself. And, doing so, Designation turns into a Value Object for the relational model as well! In real world use cases, I have found this technique to be an extremely helpful one - hence thought of sharing the tip with all the readers of this blog.

However, we are not done yet - in fact, the subject of this paragraph is decoupled instantiation models for value objects, and we haven't yet started the tango. We first had to set the stage to make Designation a value object at both the levels - domain and persistence models. Now let us find out how we can optimize our object creation at the domain layer while leaving the persistence level to our JPA implementation.

In a typical use case of the application, we may have bulk creation of employees, which may lead to a bulk creation of value objects. One of the cool features of using JPA is that we can adopt a completely different instantiation strategy for our OO domain model and the relational persistent model. While persisting the value object Designation, we are embedding it within the Employee entity - hence there is always a copy of the value object associated with the persistent Employee model. And this is completely managed by the JPA implementation of the ORM. However, for the domain model, we can control the number of distinct instances of the value object created using the Flyweight design pattern. Have a look ..

@Embeddable
class Designation {
  //.. persistent attributes

  @Transient
  private static Map<String, Designation> designations
    = new HashMap<String, Designation>();

  // package scope
  Designation() {
    //.. need this for Hibernate
  }

  // factory method
  public static Designation create(..) {
    Designation d = null;
    if ((d = designations.get(..)) != null) {
      return d;
    }
    // create new designation
    // put it in the map
    // and return
  }
  //..
  //..equals(), hashCode() .. etc.
}

We have a flyweight that manages a local cache of distinct designations created and controls the number of objects instantiated. And since value objects are immutable, they can be freely shared across entities in the domain model. Here is an example where using JPA we can decouple the instantiation strategy of the domain objects from the persistence layer. Although we are storing value objects by-value in the database, we need not have distinct in-memory instances in our domain model. And, if you are using Hibernate, you need not have a public constructor as well. For generation of proxy, Hibernate recommends at least package visibility, which works fine with our strategy of controlling instantiation at the domain layer using flyweights.

Value objects are invaluable in making designs more manageable and flexible. And JPA provides great support in transparent handling of instantiation and persistence of value objects along with their owning entities. With a rich domain model, backed up up by a great ORM like Hibernate that implements JPA, we can get the best of both worlds - powerful OO abstractions as well as transparent handling of their persistence in the relational database. I had earlier blogged about injecting ORM backed repositories for transparent data access in a domain model. In future installments of this series, I plan to cover more on this subject describing real life use cases of applying domain driven design techniques using JPA and Hibernate.

Monday, September 24, 2007

Party time Over for Rails .. It's time to Deliver !

People have increasingly started talking about Erlang and putting the concurrency-idol on the hot seat for an upcoming session of dissecting its anatomy. Possibly the fan fare of Rails has started taking a backseat. Is this a sign of maturity for Ruby on Rails ? Or a case of disillusionment ?

As Brian points out :

In terms of the Gartner Hype Cycle, Ruby and Rails are nearing the end of the "Peak of Inflated Expectations" and are moving into the "Trough of Disillusionment".

It's indeed true that Rails have enjoyed the limelight of being the darling of the community and a promise (or panacea) to put an end to the verbosity and *inelegance* of Java based Web development. It has been quite some time though, yet we hear people like Obie utter the following in the context of choosing Java over Rails in developing Big Enterprise Applications :

Like Gavin pointed out, you can't do an app with thousands of model classes in Rails. Okay, but I'm really not trying to say that you should try. Right now, I personally have no interest in working on those kinds of monstrosities, but to each his own. If that big enterprise application exposed web interfaces, I might be inclined to use JRuby on Rails to do those..

It's time to deliver for Rails ! How long will Rails shrug away from the enterprise applications branding them as monstrous ? It's time to take Rails out of the playful scaffolding model that delivers basic CRUD applications at lightening speed. To deliver complex enterprise applications, you need a performant application server with production quality deployment support, a real fast virtual machine and solid domain modeling capabilities. We all know the ActiveRecord model is never going to scale up for large complex domain models. The Rails community, till now, do not acknowledge how important it is to have a separate domain model, decoupled (or loosely coupled) with the persistence layer. And ActiveRecord will never take you towards that end. How much strong metaprogramming capabilities you may use to offer fancy dynamic finders in Rails, ultimately you need to have a strong decoupled domain model to deliver big enterprise applications. Till then, I guess, JRuby will be your best bet and Rails will remain just another Java library sucked up within a Java application server.

Thursday, September 20, 2007

Domain Modeling with JPA - The Gotchas - Part 1 - Immutable Entities

Working on rich domain models, how many times have we wondered if we had a cookbook of best practices that discussed some of the common gotchas that we face everyday. With this post, I would like to start a mini series of such discussions, which will center around issues in rich domain modeling using JPA and Hibernate. In course of this series, I will discuss many of the problems, for which I have no satisfactory solution. I would love to have the feedback from the community for the best practices and what they feel should be done to address such issues.

Each installment of this series will discuss a single issue relevant only to designing rich domain models using JPA and Hibernate. With enough reader feedback, we should be able to have a healthy discussion on how to address it in the real world of domain modeling using the principles of Domain Driven Design. The topic of today's post is Immutability of Domain Entities and how to address this issue in the context of modeling persistent entities. While no one denies the fact that immutability is a virtue that a model should maximally possess, still there are practical concerns and reasons to act otherwise in many situations. The domain model is only one of the layers in the application architecture stack - we need to interact with other layers as well. And this is where things start getting interesting and often deviate from the ideal world.

Immutability (aka public setter methods)

There are two aspects to immutability of entities.

#1 : Making an Entity intrisically Immutable

An entity is immutable in the sense that *no* update or delete are allowed for that entity. Once created the entity is truly *immutable*. The Hibernate annotation @Immutable can be used to indicate that the entity may not be updated or deleted by the application. This also allows Hibernate to make some minor performance optimizations.

@Immutable
public class Bid implements Serializable {
  // ..
  // ..
}

Needless to say, we do not have any setter methods exposed for this entity.

Immutability can also be ensured for selective columns or properties. In that case the setter method is not exposed for this property and the ORM generated update statement also does not include any of these columns.

@Entity
public class Flight implements Serializable {
  // ..
  @Column(updatable = false, name = "flight_name", nullable = false, length=50)
  public String getName() { ... }
  // ..
}

For the above entity, the name property is mapped to the flight_name column, which is not nullable, has a length of 50 and is not updatable (making the property immutable).

#2 : Immutable in the domain layer

This is one of the most debated areas in domain modeling. Should you allow public setters to be exposed for all entities ? Or you would like to handle all mutations through domain methods only. Either way there are some issues to consider :

If we expose public setters, then we risk exposing the domain model. Any component in the layers above (e.g. Web layer) can invoke the setter on the entity and make the domain model inconsistent. e.g. the Web layer may invoke account.setOpeningBalance(0), despite the fact that there is a minimum balance check associated with the domain logic. We can have that validation within the setter itself, but ideally that domain logic should be there in the domain method, which is named following the Ubiquitous Language. In the current case, we should have a method named account.open(..), which should encapsulate all the domain logic associated with the opening of an account. This is one of the fundamental tenets of rich domain models. From this point of view, smart setters are an anti-pattern in domain modeling.

If we do not have setters exposed, then how will the Web MVC framework transport data from the Form objects to the domain objects ? The usual way frameworks like Spring MVC works is to use the public setter methods to set the command object values. One solution is to use DTOs (or one of its variants), but that is again one of the dark corners which needs a separate post of its own. We would definitely like to make a maximal reuse of our domain model and try to use them throughout the application architecture stack. One of the options that I have used is to have public setters exposed but control the usage of setters only in the Web MVC layer through development aspects. Another option may be to introduce public setters through Inter Type Declaration using aspects and use the introduced interface in the Web MVC layer only. Both of these options, while not very intuitive, deliver the goods in practice. You can do away with exposing setters *globally* from the domain model.

What about getters ? Should we have

Collection<Employee> getEmployees() {
    return employees;
}

Collection<Employee> getEmployees() {
    return Collections.unmodifiableList(employees);
}

In the former case, the collection returned is not immutable and we have the convenience of doing the following ..

Employee emp = ...  // make an employee
office.getEmployees().add(emp);  // add him to the office

while all such logic will have to be routed through specific domain methods in case of the enforced immutability of the returned collection for the second option. While this looks more ideal, the first approach also has lots of valid use cases and pragmatic usage.

Use the comments section of the post to discuss what approach you take when designing rich domain models. There may not be one single golden rule to follow in all scenarios, but we can know about some of the best practices followed in the community.

Monday, September 17, 2007

Code-as-Data, Encapsulation and the Lisp Dogma

Raganwald talks about code/data separation and encapsulation. Here he quotes Steve Yegge from one of his drunken rants, where the latter points out the virtues of using Lisp s-expressions as executable XML.

Just could not resist ruminating Douglas Hofstadter in his Metamagical Themas on the same subject in his essay Lisp: Recursion and Generality. He talks about Lisp as the medium that unifies the *inert* data (which he calls declarative knowledge) with *active* code (or procedural knowledge). He mentions ..

The main idea is that in Lisp, one has the ability to "elevate" an inert, information-containing data structure to the level of "animate agent", where it becomes a manipulator of inert structures itself. This program-data cycle, or loop, can continue on and on, with structures reaching out, twisting back, and indirectly modifying themselves or related structures.

Talking about this data-code duality he goes on ..

Moreover, Lisp's loop of program and data should remind biologists of the way that genes dictate the form of enzymes, and enzymes manipulate genes (among other things). Thus Lisp's procedural-declarative program-data loop provides a primitive, but very useful and tangible example of one of the most fundamental patterns at the base of life: the ability of passive structures to control their own destiny, by creating and regulating active structures whose form they dictate.

Steve Yegge has also expressed it succinctly in the same post (mentioned above) ..

But Lisp is directly executable, so you could simply make the tag names functions that automatically transform themselves. It'd be a lot easier than using XSLT, and less than a tenth the size.

When you have code-as-data and data-as-code, you have encapsulated the data structures in a form where they can transform themselves. While Lisp allows you to do this, is it too unnatural for a programming language that forces you to program in its native abstract syntax tree format?

Monday, September 10, 2007

Got Closures ? Have OO

In the classical object oriented model, an object encapsulates local state (instance variables) and contains a pointer to the shared procedures. These procedures are the methods, which operate on the encapsulated state, that forms the environment of the object. Each method can declare local variables as well as look up for additional state information from the shared environment based on lexical scoping rules. So we have the object as the combination of the environment and the set of methods that operate on the environment. Class based languages like Java and C++ provide another abstraction - the class, which instantiates objects by initializing the environment and setting up appropriate pointers to the shared procedures.

But what if my programming language is classless ? One of the best examples of this is the Javascript language. It is based on prototypes, without the class structure and supports higher order functions and lexical closures. How do I implement object-orientation in Javascript ? Javascript is a prototypal language without any built in support for classes. This post attempts to look into OO through a different looking glass, a quite unnatural source, a quite different language and a very much different programming paradigm. In the absence of first class support for the classical OO paradigm, this post looks at alternative means of implementing encapsulation and object-orientation while designing real world abstractions. The language used is Scheme, popularly referred to as a functional language, and the implementation addresses all issues faced by the other numerous classless languages mentioned above.

The basic theme of the following discussion is from SICP, a true classic of Computer Science in the domain of programming languages. In case you have any doubt regarding the credibility of SICP, please go through the first two customer reviews of this book in Amazon.

Object Orientation - the Scheme way !

Let us look at the following abstraction of a bank account in Scheme ..

(define make-account
  (lambda (account-no account-name balance)

    ;; accessors
    (define (get-account-no) account-no)
    (define (get-name) account-name)
    (define (get-balance) balance)

    ;; mutators
    (define (set-account-name! new-account-name)
      (set! account-name new-account-name)
      account-name)

    ;; deposit money into account
    (define (deposit! amount)
      (set! balance (+ balance amount))
      balance)

    ;; withdraw money
    (define (withdraw! amount)
      (if (>= balance amount)
          (begin (set! balance (- balance amount))
                 balance)
          "Insufficient funds"))

    ;; the dispatcher
    (define (self message)
      (case message
        ((no)         get-account-no)
        ((nm)         get-name)
        ((balance)    get-balance)
        ((set-nm!)    set_account-name!)
        ((withdraw!)  withdraw!)
        ((deposit!)   deposit!)
        (else (error "unknown selector" message))))
    self))

Every time we call make-account, we get back a dispatch function, every instance of which shares the same code, but operates on the different set of data supplied to it, which forms the execution environment.

;; creates an account a1
(define a1 (make-account 100 "abc" 0))

;; creates an account a2
(define a2 (make-account 200 "xyz" 0))

;; fetches the account-no of a1 -> 100
((a1 'no))

;; sets the account-name of a1 to "pqr"
((a1 'set-nm!) "pqr")

That is, we have two separate objects for the account abstraction and a bunch of shared methods to operate on each of them. A clean separation of code and data, a nice encapsulation of state. The arguments passed to the make-account invocation, account-no, account-name and balance are completely encapsulated within the abstraction and can only be accessed through the shared procedures.

The Secret Sauce

Lexical closures! In the above code snippet, all free variables within the methods are looked up from the environment of execution based on the lexical scoping principles of Scheme. This is exactly similar to the classical OO model that we discussed in the beginning. In the model with first class objects, we have

the environment formed by the instance variables, encapsulating local state per object and

the methods, which are shared across objects.

These two orchestrate the mechanism through which behaviors are implemented in abstractions. OTOH in languages like Scheme and Javascript, the corresponding roles are played by

the execution context, where the procedures look up for free variables and

the procedures themselves.

Hence closures play the same role in the Scheme based implementation that objects play in a classical one with Java or C++. Here is what wikipedia has to say :

a closure is a function that is evaluated in an environment containing one or more bound variables. When called, the function can access these variables. The explicit use of closures is associated with functional programming and with languages such as ML and Lisp. Constructs such as objects in other languages can also be modeled with closures.

What about Inheritance ?

We can implement inheritance in the above abstraction by using the delegation model. This is an object based implementation, similar to what we do in classless languages like Javascript. Simply incorporate the specialized behaviors in a new abstraction and delegate the common behaviors to the base abstraction. The following example implements a minimum-balance-checking-account, which has an additional restriction on the withdraw method in the form of a minimum balance check. It delegates all common behavior to the earlier abstraction make-account, while itself implements only the specialized functionality within the withdraw method.

(define make-min-balance-account
  (lambda (account-no account-name balance)
    (let ((account (make-account account-no account-name balance)))

      ;; implement only the specialized behavior
      ;; delegate others to the base abstraction
      (define (withdraw! amount)
        (let ((bal ((account 'balance))))
          (if (>= (- bal amount) 1000)
              (begin ((account 'withdraw!) amount)
                     ((account 'balance)))
              "Min balance check failed")))

      (define (self message)
        (case message
          ((withdraw!)    withdraw!)
          (else (account message))))
      self)))

But Scheme is a Functional Language

Programming without assignments is functional programming, where procedures can be viewed as computing mathematical functions, without any change in the local state. Scheme is a multi-paradigm language and the above implementation uses assignments to mutate the local states of the abstraction. The set! operation used in the above implementation helps us model local states of objects. However, it is absolutely possible to provide a completely functional-OO system implementing polymorphism using Scheme that does not have a single assignment or mutator operation. See here for a sample implementation.

Are Closures equivalent to Objects ?

This is a very interesting topic and has been extensively discussed in various forums in the theory of programming languages. While objects are often referred to as "poor man's closures", the converse has also been shown to be true. Instead of trying to mess around with this tension between opposites, let me point you to a fascinating note on this topic in one of the forums of discussion.

Tuesday, September 04, 2007

Do you comment your code ?

Paul Graham on comments in code ..

Incidentally, I am not a big fan of comments. I think they are often an artifact of using weak languages. I think that programming languages should be a good enough way to express programs that you don't have to scrawl additional clarifications on your source code. It would be a bad sign, don't you think, if a novelist had to print notes in the margins saying "she left without saying anything because she was angry about the trampled petunias?" It is the job of the novel to make that clear. I think this is what SICP means when they say "programs must be written for people to read, and only incidentally for machines to execute." I use comments mostly to apologize/warn about hacks, kludges, limitations, etc.

This is in the context of PG defending his claim Succinctness is Power for a programming language. Love him or hate him, you cannot ignore him. I think he is a true thoughtleader in the programming languages community ..