Monday, October 08, 2007

Domain Modeling with JPA - The Gotchas - Part 3 - A Tip on Abstracting Relationships

JPA is all about POJOs and all relationships are managed as associations between POJOs. All JPA implementations are based on best practices that complement an ideal POJO based domain model. In the first part of this series, I had talked about immutable entities, which prevent clients from inadvertent mutation of the domain model. Mutation mainly affects associations between entities thereby making the aggregate inconsistent. In this post, I will discuss some of the gotchas and best practices of abstracting associations between entities from client code making your domain model much more robust.

Maintaining relationships between entities is done by the underlying JPA implementation. But it the responsibility of the respective entities to set them up based on the cardinalities specified as part of the mapping. This setting up of relationships has to be explicit through appropriate message passing between the respective POJOs. Let us consider the following relationship :


@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  @ManyToMany
  @JoinTable(
    name="ref_employee_skill",
    joinColumns=@JoinColumn(name="employee_pk", referencedColumnName="employee_pk"),
    inverseJoinColumns=@JoinColumn(name="skill_pk", referencedColumnName="skill_pk")
  )
  private Set<Skill> skills;
  //..
  //.. other properties
  //.. methods
}



There is a many-to-many relationship between Employee and Skill entities, which is set up appropriately using proper annotations. Here are the respective accessors and mutators that help us manage this relationship on the Employee side :


@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  public Set<Skill> getSkills() {
    return skills;
  }

  public void setSkills(Set<Skill> skills) {
    this.skills = skills;
  }
  //..
}



Similarly on the Skill entity we will have the corresponding annotations and the respective accessors and mutators for the employees collection ..


@Entity
public class Skill implements Serializable {
  //..
  //.. properties

  @ManyToMany(
    mappedBy="skills"
  )
  private Set<Employee> employees;

  public Set<Employee> getEmployees() {
    return employees;
  }

  public void setEmployees(Set<Employee> employees) {
    this.employees = employees;
  }
  //..
}



Can you see the problem in the above model ?

The problem lies in the fact that the model is vulnerable to inadvertent mutation. Public setters are evil and exposes the model to be changed inconsistently by the client. How ? Let us look at the following snippet of the client code trying to set up the domain relationships between an Employee and a collection of Skills ..


Employee emp = .. ; // managed
Set<Skill> skills = .. ; //managed
emp.setSkills(skills);



The public setter sets the skillset of the employee to the set skills. But what about the back reference ? Every skill should also point to the Employee emp. And this needs to be done explicitly by the client code.


for(Skill skill : skills) {
  skill.getEmployees().add(emp);
}



This completes the relationship management code on part of the client. But is this the best level of abstraction that we can offer from the domain model ?

Try this !

If the setter can make your model inconsistent, do not make it public. Hibernate does not mandate public setters for its own working. Replace public setters with domain friendly APIs, which make more sense to your client. How about addSkill(..) ?


@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  public Employee addSkill(final Skill skill) {
    skills.add(skill);
    skill.addEmployee(this);
    return this;
  }
  //..
}



addSkill() adds a skill to an employee. Internally it updates the collection of skills and best of all, transparently manages both sides of the relationship. And returns the current Employee instance to make it a FluentInterface. Now your client can use your API as ..


Employee emp = .. ; // managed
emp.add(skill_1)
   .add(skill_2)
   .add(skill_3);



Nice!

For clients holding a collection of skills (managed), add another helper method ..


@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  public Employee addSkill(final Skill skill) {
    //.. as above
  }

  public Employee addSkills(final Set<Skill> skills) {
    skills.addAll(skills);

    for(Skill skill : skills) {
      skill.addEmployee(this);
    }
    return this;
  }
  //..
}



Both the above methods abstract away the mechanics of relationship management from the client and present fluent interfaces which are much more friendlier for the domain. Remove the public setter and don't forget to make the getter return an immutable collection ..


@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  public Set<Skill> getSkills() {
    return Collections.unmodifiableSet(skills);
  }
  //..
}



The above approach makes your model more robust and domain friendly by abstracting away the mechanics of relationship management from your client code. Now, as an astute reader you must be wondering how would you use this domain entity as the command object of you controller layer in MVC frameworks - the setters are no longer there as public methods ! Well, that is the subject of another post, another day.

12 comments:

Mikhail.Khludnev said...

just to help you make next step....

Hello

It would be great if we will able to work with add/get/removeSkill as with usual collection. isn't? Of course descision is simple - use collection adapter to reflect colection changes to bidirectional relationship. The question is "How?"

Second. What do you think about removeSkill() implementation. It seems that there is one trap with it.

Thanks for your blog. It's appreciated.

Unknown said...

@mikhail: +1 on your observation that it will be fun to work with collection adapters. That is what I meant by abstracting relationships in the post. At the same time we need to be careful that we do not expose any modifiable collection as part of the getter.

>> It seems that there is one trap with it.

Here is my take ..
On the employee side ..
class Employee {
//..
private Set<Skill> skills;

//..
public Employee removeSkill(final Skill skill) {
skill.removeEmployee(this);
skills.remove(skill);
return this;
}
//..
}

and on the Skill side ..

class Skill {
//..
private Set<Employee> employees;
//..
Skill removeEmployee(final Employee employee) {
employees.remove(employee);
return this;
}
//..
}

Note that the method on the Skill side is NOT public - it is package protected and is used in the implementation of Employee.removeSkill().

Mikhail.Khludnev said...

Thanks for you attention

First, Do we have an apropriate Observable Collection in Java (to implement an adapter)?

Second, It seems that you've been caughted...
the many-to-many remove problem is here:
skill.removeEmployee(this);

Imagine, You obtain Emploee from Repository (actually you've got a proxy!!!) and invoke

emploee.removeSkill(skill);

Lazy proxy delegates this call to tha target entity. So, here skill.removeEmployee(this); this points to the target entity (not proxy!!!). But skill.emploees actualy is collection of proxies!!!. So, the problem is how to find proxy by entity in collection to remove it. By default entity doesn't equal to its proxy!!!

We have a several workaround for this issue. Mike Keith proposes to implement equals method accordingly for this case. (but I don't like to override Object.equals() and think that it can affect some behavior...)

What do you think? Am I wrong? Am I dumb?

Anyway thank you.

Unknown said...

@mikhail: Thanks for your insightful observation. I understand the problem that u r talking about and ur suggested workarounds.

I am using Hibernate as the JPA implementation. Hibernate does not use the native java Collection, it uses a custom implementation PersistentCollection. I have not yet dug deep into what it does behind the doors. But looks like it does some magic to initialize the collection at the right time. I tried putting up my example in the debugger and it shows the collection skill.employees as initialized before the remove is called. I checked up that the fetchType is specified as FetchType.LAZY (which is the default for many 2 many).

Would love to hear if you have any observation on this. Also, will love to get any more pointers from you on this issue.

Cheers.

Mikhail.Khludnev said...

@Debasish
problem isn't in the collection initialization.

just add assertion to your code:

// i inlined removeEmployee
public Employee removeSkill(final Skill skill) {
boolean changed1=skill.emploees.remove(this);
boolean changed2=skills.remove(skill);
assert changed1==changed2
return this;
}

obtain two associated entity like proxy from repositories (emploee and skill) and try to drop its accociation.
You will take changed1==false because this (entity) can't be founded in collection of proxies.

I hope it helps

Unknown said...

@mikhail: Very true. Whatever u have mentioned holds exactly true when we are working with proxies at both ends. i.e. when I fetch employee using EntityManager.getReference(). In my case I was using EntityManager.find() - hence the Employee instance is NOT a proxy, though the many 2 many collection skills is lazily loaded. Now when I do a skillRepository.find() to get the skill which belongs to the same employee, then the Collection skill.employees is no longer a Collection of proxies. Hence skill.employees.remove(this) works ok and I get a true.

However, the situation changes if I use getReference() instead of the find(). I get both employee and skill as proxies and the remove() fails (as it should be) - I get changed1 as false.

Here's one way to fix this, which u have also mentioned, though u do not like it. I override equals() in Employee ..

@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || !(o instanceof Employee))
return false;
Employee emp = (Employee) o;
if (employeePk != 0 ? employeePk != emp.getEmployeePk() : emp.getEmployeePk() != 0)
return false;
return true;
}

Thanks a lot for your observations and for reading my blog.

Mikhail.Khludnev said...

Hi,

it seems that this code can be a little bit pretty than now:
if (employeePk != 0 ? employeePk != emp.getEmployeePk() : emp.getEmployeePk() != 0)
return false;
return true;


... hm... nevermind

But, Why don't like to override equals() and hashCode()? for example your code changes equivalence of entity after obtaining PK (going to presistent state). It's not a pretty god idea because you can "loose" you entity in the HashMap or HashSet. (I mean the followed scenario: put fresh transiet instance to HashSet, persist it in Hibernate session, it leads to putting PK into entity, it leads to changig its hashCode, so you can't find it in those HashSet)

Have a good time

SornET said...

What about using Iterator to recover collection values?

Mikhail.Khludnev said...

2 SornET:

didn't get your thought about "recover"?

SornET said...

I'm sorry?

Unknown said...

@Sornet:
It is not clear what you mean by "recover" from the collection. Whenever you try to remove an element from a collection based on a value, the equals() method kicks in. And this is where the problem comes in for the example stated. One being a collection of objects and the other a collection of proxies.

Will be helpful if you expalin your thoughts in a bit more detail.

Anonymous said...

@mikhail:

You usually do not include the primary key, i.e. the ID of an object, in the equals() and hashCode()-methods. You don't want to determine database identity (equality), but object identity. You should only include "significant" fields, as Joshua Bloch put it in his book "Effective Java Programming Language Guide". Here is an example chapter form this book talking about the equals() and hashCode() methods: http://java.sun.com/developer/Books/effectivejava/Chapter3.pdf