Implementing an equals method in Java can be quite complicated. Fortunately, there are numerous documents around the web with useful tips, hints and frameworks to assist you in this process. However, an implementation of the equals method that is technically correct doesn’t necessarily make any sense functionally. In Domain Driven Design, your domain model implementation is the beating heart of your application. Everything has to make perfect (functional) sense in there, so having good equals methods is of vital importance.

In this article, I will elaborate on some common pitfalls you can encounter when implementing the equals method, as well as some sensible guidelines.

Many IDEs nowadays allow you to generate technically perfect and compliant implementations of the equals method for any object. You simply choose a number of properties you wish to include in the comparison, indicate some of them as being non-null values and voilà. Write a small unit test for the thing, commit the whole shebang and you’re done. Well, not quite. You’ve probably got an implementation even worse than the one provided by Object.
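One way a generated implementation can turn out worse than the one from Object is sketched below. The Customer class, its fields and the rename method are hypothetical names chosen for illustration; the equals and hashCode bodies only imitate what an IDE typically generates when you tick every field. Because the hash code depends on a mutable field, a HashSet can lose track of an instance it already contains.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical mutable entity with IDE-style equals/hashCode over all fields.
final class Customer {
    private final long id;
    private String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }

    void rename(String newName) {
        this.name = newName;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Customer other = (Customer) o;
        return id == other.id && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Depends on the mutable name, so it changes when the customer is renamed.
        return Objects.hash(id, name);
    }
}

final class GeneratedEqualsDemo {
    // Reports whether the set still finds the customer after its name changed.
    static boolean foundAfterRename() {
        Set<Customer> customers = new HashSet<>();
        Customer customer = new Customer(1L, "Allard");
        customers.add(customer);
        customer.rename("Bob");
        return customers.contains(customer);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterRename());
    }
}
```

With the default implementation from Object, the same lookup would succeed, since identity-based hash codes do not change when state changes.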

I’ve seen developers generate equals methods in mere seconds. Although managers tend to love this sort of “productivity”, as an architect doing code reviews, I measure productivity differently. I doubt whether any developer can properly evaluate the functional value of such an implementation in just seconds.

First, we need to define what equals really means. To me, when two objects are equal, it means they are identical to such a degree that one can replace the other without side effects. In other words, if two objects are equal, it doesn’t matter which one you pick. In math, equality is very well defined. In software, it is a little harder to achieve that level of definition. But that doesn’t mean we shouldn’t try.

As the title of this article suggests, I want to look at the equals method in the context of Domain Driven Design (DDD). In DDD, it’s all about making concepts explicit. In our case, it means answering the question: “What does it mean, for two [fill in the blank] to be equal?” Of course, there isn’t really a single generic answer for all objects. However, objects can be classified into a few major groups in DDD: Entities, Value Objects and (Domain)Services. The last of the three is not really of much interest in this context, so let’s focus on the other two.

Value objects

Value objects are immutable objects of which only the properties are of importance. They carry no concept of identity. A perfect example of a value object in daily life is money. Personally, I don’t care which 10 euro bill I carry, as long as it is a valid 10 euros. You can swap it with one owned by a friend and you won’t feel better or worse about it. The two bills are equal.

Because value objects are immutable, testing them for equality is rather easy. Generally, you can just generate an equals method using all of the (exposed) properties of the object. In our example, as long as the currency and the amount are the same, we don’t really care which instance of the bill we carry with us.

Therefore: the default choice for the equals method on value objects should be to include all (exposed) properties of that value object in the comparison.
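As a minimal sketch of that guideline, consider the money example. The Money class below, with its amount in cents and its currency, is an assumption made for illustration and not a prescribed design; the point is only that every exposed property takes part in both equals and hashCode.

```java
import java.util.Currency;
import java.util.Objects;

// Hypothetical immutable value object: no identity, only properties.
final class Money {
    private final long amountInCents;
    private final Currency currency;

    Money(long amountInCents, Currency currency) {
        this.amountInCents = amountInCents;
        this.currency = Objects.requireNonNull(currency, "currency");
    }

    long getAmountInCents() {
        return amountInCents;
    }

    Currency getCurrency() {
        return currency;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Money other = (Money) o;
        // All exposed properties take part in the comparison.
        return amountInCents == other.amountInCents
                && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        // Built from the same properties as equals, and stable because they never change.
        return Objects.hash(amountInCents, currency);
    }
}
```

Two separately created instances for 10 euros are equal and interchangeable, just like the two bills.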

Entities

An entity is “an object fundamentally defined not by its attributes, but by a thread of continuity and identity.” In daily life, having the same name as somebody else doesn’t make you the same person. This form of mistaken identity can lead to huge problems in an application.

But back to our equals implementation. What does it mean for two entities to be equal? Well, it should mean that they can replace each other without side effects. If they can’t replace each other, they can’t really be equal. This means that, for two entities to be equal, at least their identity should be equal.

But entities have mutable state. To what degree do you want to use that state in the comparison? That really depends on the context of the comparison. If you want to know if the state has been modified between two copies of the instance, you will need an equals method that checks on all mutable properties as well as the identity. If you are only interested in knowing whether you are talking about an object representation of the actual same thing, identity comparison is the only thing you need.

If there is one thing an equals method cannot do, it is to look at the context and intentions of the caller. Since it is extremely important to use “intention revealing interfaces”, an equals method on an entity is probably not the right way to go. In a discussion with Eric Evans, he explained that he prefers not to implement equals on entities at all, and instead provides other comparison methods, such as “hasSameIdentityAs”. This method clearly states what it means to be the same. Depending on the context and intention of your comparison, you call another method.
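A rough sketch of such intention revealing comparison methods could look as follows. The Person class, its id and address fields and the moveTo method are hypothetical; only the method names hasSameIdentityAs and hasSameStateAs come from this article.

```java
import java.util.Objects;

// Hypothetical entity that exposes named comparisons instead of overriding equals.
final class Person {
    private final String id;   // identity, never changes
    private String address;    // mutable state

    Person(String id, String address) {
        this.id = Objects.requireNonNull(id, "id");
        this.address = address;
    }

    void moveTo(String newAddress) {
        this.address = newAddress;
    }

    // "Are we talking about the same actual thing?"
    boolean hasSameIdentityAs(Person other) {
        return other != null && id.equals(other.id);
    }

    // "Is this the same thing, and has nothing changed between the two copies?"
    boolean hasSameStateAs(Person other) {
        return hasSameIdentityAs(other)
                && Objects.equals(address, other.address);
    }
}
```

The caller picks the comparison that matches its intention, so the code at the call site documents which question is being asked.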

Let’s go back to the statement about equality: when two objects are equal, it means that you can replace one with the other without side effects.

Replacing one entity instance with another is dangerous in most circumstances. Even if they have the same identifier, they might have different state. The different instances are likely to be used by different threads with different intentions. When using persistence frameworks like JPA, your entity is likely to be attached to a persistence context, meaning that replacing it without side effects is out of the question.

This probably means that this statement is a little too rigid for entities. Still, entities need some workable notion of equality, because mechanisms like the Set rely on the equals method to decide whether you are allowed to add an item or not.

Personally, I like to define two entities as equal when you are talking about the representation of the same actual thing, that is, when the type and identity of the two are the same. This means my choice of equals method will only take the actual class and identity into consideration.

This gives us a nice problem when identity is provided by the persistence framework at the time an object is persisted. How do you measure equality when one or both instances do not (yet) have an identity? Although you cannot (or should not) try to predict the identity of those instances, there is one thing you can say about it. If two different instances have no identity, there is no way a persistence framework will assign them the same identity. This means that you can revert to the default implementation of equals in that case (practically doing a == comparison). The same goes for comparison of an entity with identity and one without: they will never, ever have the same identity in the future.
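The rule described above can be sketched roughly as follows. The Account class and the assignId method (standing in for whatever the persistence framework does when it persists the object) are hypothetical. The constant hash code is my own assumption, added so that an instance stays findable in a hash-based collection once its identity is assigned; the article itself only discusses equals.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical entity whose identity is assigned when it is persisted.
class Account {
    private Long id;        // null until persisted
    private String owner;   // mutable state, deliberately ignored by equals

    Account(String owner) {
        this.owner = owner;
    }

    // Stand-in for the persistence framework assigning an identity.
    void assignId(long newId) {
        this.id = newId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;   // the default Object behaviour
        if (o == null || getClass() != o.getClass()) return false;
        Account other = (Account) o;
        // Without an identity on this side, only the == check above can make
        // two instances equal; a null id on the other side never matches.
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        // Same value for every instance of the class, so it does not change
        // when the id is assigned later (an assumption of this sketch).
        return getClass().hashCode();
    }
}

final class EntityEqualsDemo {
    public static void main(String[] args) {
        Set<Account> accounts = new HashSet<>();
        Account first = new Account("Allard");
        Account second = new Account("Allard");
        accounts.add(first);
        accounts.add(second);
        System.out.println(accounts.size());
    }
}
```

The trade-off of the constant hash code is that all accounts land in the same bucket, which costs performance in large hash-based collections.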

Conclusion

Before implementing the equals method, think clearly about the type of object that you are comparing. If it is an immutable value object, you should include all (exposed) properties of the value object. If it is an entity, be very cautious and first define what equality really means. Consider using methods with intention revealing interfaces, such as “hasSameIdentityAs” and “hasSameStateAs”.

6 thoughts on “Domain Driven Design and the equals method”

  • July 30, 2009 at 7:26 am

    This is the type of discussion that is really nice to have face-to-face. As you can see, the value of an equals method really depends on the context. In the case Developer Dude describes, you are in charge (in the GUI) of calling the equals method. What you really care about is full state comparison. Why not call a method “hasSameState” instead? It explains better what you want to do.

    In some cases, you don’t have any control over which method is used. As Developer Dude points out, some collections rely on equals (and hashCode) to work properly. The Set is a good example of this. It uses the equals method to see whether duplicates exist. In my opinion, this just means that (for entities), the identity needs to be evaluated in the equals method, not the rest of the object. Consider this case: we have a Human, with name “Allard” (in our case the name is the identity). Now, we create a copy of this instance and change the Address. In my context, we are still talking about the same person, me. So we don’t want to add “me” to the Set twice. If we were to include all properties in the equals method, the Set would just accept “me” twice, without a problem.

    I am saying that in your own implementation, you should try not to rely on equals, as it means too many things in different contexts. Instead, create a method with a name that really clarifies what the intent of the comparison is.

    Developer Dude, if you want to read up on DDD, there is a free version of Eric Evans’ book available on InfoQ: http://www.infoq.com/minibooks/domain-driven-design-quickly

  • July 30, 2009 at 1:38 am

    Allard,

    I need to read up on DDD (especially before commenting on articles about domain objects obviously), but I think for the most generic use case of a domain object, equals should be the simplest case – property for property comparison, then the more specialized method names should be used for those use cases where you need more differentiation/explanation/meaning. Especially since equals and hashcode are used extensively in collections.

    Of course (it just occurred to me), that maybe we are saying the same thing – that this is what you mean.

    Also, I should have explained more when I said NOT to compare against subclasses/interfaces – I don’t mean never, I just mean this should be the default, which it usually isn’t in most implementations of equals. Again, what is the most usual case (80+ percent of the time)?

    As for immutable objects, I definitely see the value of them, but I have tried to use them practically, especially with persistence and GUI strategies, and they become unwieldy (at least the patterns I have seen for creating/manipulating them). Now maybe I haven’t seen some new pattern/methodology for writing an immutable bean with 20+ properties that doesn’t depend on a constructor with a huge list of arguments (which doesn’t always work well with many frameworks that expect setters and support construction via constructors as something of an afterthought, or at least not nearly as conveniently as setters/getters – I am thinking of Spring and iBatis as examples).

    Most of the time I have not used immutable objects for anything more than a simple small value object that is not really used for a core domain problem, but rather it was inside of some utility class (like for sending an email).

  • July 30, 2009 at 1:24 am

    Greyfairer,

    I don’t think I would want to have one ‘Value Object’ (not necessarily immutable for my definition – just a simple Java bean with getters/setters and no real behavior, used for passing around the ‘value’ of a given abstraction) for the GUI and one for the persistence layer, both representing the same domain concept. I have seen such code and not only is it at least confusing why it is necessary, it leads to duplicated code, usually unnecessary code, poor code reuse, bugs, mismatches of various sorts (types, names, concepts, etc.).

    In my mind, I try to keep to the simpler concepts – code reuse, DRY and KISS, among others. I agree that context can change, and I do think this is one area where convention can have value. Where equals() for a POJO simply means comparing the public properties of the POJO one by one – and yes, even then there may be exceptions, but again I try to go by one of the simpler concepts; design using the 80/20 rule (actually, in my experience, it is more like 90/10) – at least the way I interpret it: yes, there will be exceptional cases, but cover the common ones first, then handle the exceptional ones as you encounter them. If my concept of equals() covers 90% of the use cases, then override the equals() method I wrote, or write another specially named equals method, for the other 10% of the time when you do need some special meaning. Don’t start out with a bunch of special meanings which will probably result in code bloat, when the generic use case applies just fine.

    As for a different form of domain object based on the type of persistence (if I am assuming correctly what you are implying), yeah, I have seen that too. For example, I have seen a slightly different POJO for Hibernate get translated back and forth to/from the generic POJO, simply because the person who wrote the code didn’t understand how to reuse a plain POJO with Hibernate. Half an hour of refactoring and Hibernate was using the same POJO as everybody else and many bugs went away (bugs that had gone undiscovered because the author had not written any unit tests).

    In short, I don’t think having different forms of the same domain object for different persistence strategies is a good thing at all. In fact, my knee jerk reaction would be ‘yuck’. What happens when you switch from Hibernate to iBatis? Do you have to write another set of domain objects now?

    Correct me if I am jumping to the wrong conclusion as to your statements regarding the GUI v. Hibernate using two different value objects (classes, not instances).

  • July 29, 2009 at 10:27 pm

    Hi Developer Dude,

    I agree with you on the GUI flow, as long as you are talking about ‘Value Objects’. But in real projects, most of your POJOs will be Entities. And I have seen too many people abusing equals and clone for the GUI flow. In a Hibernate world, you never have two Java object instances representing the same persistent entity.

    If you want to edit an Entity in a GUI, I suggest you ‘clone’ the fields in another dedicated Value Object, and when done editing, you only compare those fields that have been edited. So don’t use the standard clone() and equals() methods, but rather e.g. getNameAndTitle(), hasSameNameAndTitle() and updateNameAndTitle().

    See also http://www.javaworld.com/javaworld/jw-09-2003/jw-0905-toolbox.html

  • July 29, 2009 at 6:45 pm

    “Developer Dude”,

    don’t get me wrong. I don’t think the equals methods IDEs generate are crap. In fact, they generate technically perfectly correct implementations. The point I am trying to get across is that you should think about the meaning of equals before implementing it.

    Also note that with “Value Object”, I mean the value object as described by Eric Evans in Domain Driven Design. That means they are immutable. It is safe to share them, since they are immutable. Using this type of object is extremely safe and can even reduce complexity a lot. Modifying operations on them will just return a new instance with the new state, without changing the instance the method was called on.

    Implementing equals on entities for unit testing is dangerous. Testing is just another context that you are adding to an already complex (and often under-evaluated) set of contexts. Instead, create some methods that clearly show what “equals” means. Bugs are right around the corner if you don’t pay attention.

    I don’t think it is fair to say you should never check for equality based on subclasses. It really depends on the context and the functional meaning of equality. However, doing so is very complex, since a.equals(b) should have the same result as b.equals(a). But that is a technical issue, which is not in the scope of this article.

    To summarize my point: equals is a dangerous method name, as it can mean a different thing in a different context, especially for entities. Try to use more meaningful (and intention revealing) method names instead.

  • July 29, 2009 at 5:45 pm

    The two methods I always implement in my POJOs/value objects/almost beans are equals() and hashCode(). It is well known that both are important in various collections (like HashMaps), and in my experience the value objects wind up in collections a lot.

    I also find equals() valuable for the UI where I need to tell if one copy of a particular value object has changed or not.

    The typical pattern is for the UI widget to be handed a reference to a value object. The widget doesn’t know (and shouldn’t know) whether another component/thread/whatever has a reference to this particular instance, so it clones the object (implementing clone() correctly is another important issue). The clone allows it to not interfere with other copies of the object, and to know that it in turn won’t have its own rug pulled out from under it too. Sometimes a widget may want to keep yet another copy of the object in its original state for various reasons (to return to its default state if the user so desires, to know for sure if the state has become ‘dirty’, etc.). Equals is valuable for this.

    There are other uses for properly implemented equals() and hashCode() methods, including unit testing.

    You are correct about the IDEs – their equals methods are crap. I didn’t bother to look at their hashCode methods. I personally use Apache Commons EqualsBuilder and HashCodeBuilder and I preface those with some sanity checks:

    @Override
    public boolean equals(final Object rhs_)
    {
        if (rhs_ == null) return false;

        // don’t compare by interface or base class, compare by runtime class.
        if (rhs_.getClass() != getClass()) return false;

        if (rhs_ == this) // same object reference – have to be equal
            return true;

        MyObject rhs = (MyObject) rhs_;

        // use EqualsBuilder for individual properties
        // (“name” is only a placeholder property for illustration)
        return new EqualsBuilder()
            .append(name, rhs.name)
            .isEquals();
    }

    I do not call the appendSuper() method on EqualsBuilder for the first class that extends Object. I sometimes have some special cases for collections – in many of my value objects, an empty collection is equal to a null collection for our purposes, especially since all of the get methods that return lists create an empty list to return if the list is null (it cuts down on a lot of bugs and makes for more concise/readable code).

    One of the key things I try to get across to others is that you should NOT check equality based on subclasses/interfaces. For example, if you have two implementations of Money, a Dollar is not equal to a Yen. A Car is not equal to a Truck, even though both are motor vehicles. Objects are only equal if they are the same exact class type.

    As for identity, this varies with your domain. In my experience, in the domains I have worked in and in the ways we have chosen to implement/use identities, we made no differentiation between using equals/hashCode for value objects that had identity and those that didn’t (the latter were usually contained within an object that had identity).
