Aleksandr Zimin

Equals and hashCode contracts or whatever it is

Published in the Random EN group
The vast majority of Java programmers know, of course, that the equals and hashCode methods are closely related, and that it is advisable to override both of them consistently in your own classes. Somewhat fewer know why this is so and what sad consequences follow when the rule is broken. I propose to look at the idea behind these methods, recall their purpose, and understand why they are so tightly connected. I wrote this article, like the previous one about class loading, for myself, in order to finally work through all the details of the issue and not have to return to third-party sources. I will therefore be glad to receive constructive criticism, because if there are gaps somewhere, they should be eliminated. The article, alas, turned out to be quite lengthy.

equals override rules

The equals() method exists in Java to confirm or deny that two objects of the same origin are logically equal. That is, when comparing two objects, the programmer needs to determine whether their significant fields are equivalent. Not all fields have to be identical, since equals() implies logical equality. Sometimes, however, there is no real need to override this method. As they say, the easiest way to avoid problems with a mechanism is not to use it. Note also that once you break the equals contract, you can no longer predict how other objects and structures will interact with your object, and finding the cause of the resulting error later will be very difficult.

When not to override this method

  • When each instance of a class is unique.
    This applies mostly to classes that provide specific behavior rather than being designed to hold data, such as the Thread class. For them, the equals implementation provided by the Object class is more than enough. Another example is enum classes (Enum).
  • When the class is not actually required to determine the equivalence of its instances.
    For example, for java.util.Random there is no need at all to compare instances with each other to find out whether they can return the same sequence of random numbers. The nature of this class simply does not imply such behavior.
  • When the class you are extending already has its own implementation of equals and the behavior of that implementation suits you.
    For example, for Set, List and Map implementations, equals is implemented in AbstractSet, AbstractList and AbstractMap respectively.
  • And finally, there is no need to override equals when your class is private or package-private and you are sure that this method will never be called.

equals contract

When overriding the equals method, the developer must adhere to the basic rules defined in the Java specification (the documentation of Object.equals).
  • Reflexivity
    For any given value x, the expression x.equals(x) must return true.
    "Given" here means that x != null.
  • Symmetry
    For any given values x and y, x.equals(y) must return true if and only if y.equals(x) returns true.
  • Transitivity
    For any given values x, y and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.
  • Consistency
    For any given values x and y, a repeated call to x.equals(y) must return the same value as the previous call, provided that the fields used to compare the two objects did not change between calls.
  • Comparison with null
    For any given value x, the call x.equals(null) must return false.
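The five rules above can be checked by hand on a small value class. The sketch below is only an illustration: the Money class and its fields are hypothetical names introduced for this example, not something from the sources of this article.

```java
import java.util.Objects;

// Hypothetical value class, introduced only to illustrate the contract.
final class Money {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Money)) return false; // also rejects null
        Money other = (Money) o;
        return cents == other.cents && Objects.equals(currency, other.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(cents, currency);
    }
}

class EqualsContractDemo {
    public static void main(String[] args) {
        Money x = new Money(100, "EUR");
        Money y = new Money(100, "EUR");
        Money z = new Money(100, "EUR");

        // Every line below prints true for a contract-abiding equals.
        System.out.println(x.equals(x));                               // reflexivity
        System.out.println(x.equals(y) == y.equals(x));                // symmetry
        System.out.println(x.equals(y) && y.equals(z) && x.equals(z)); // transitivity
        System.out.println(x.equals(y) == x.equals(y));                // consistency
        System.out.println(!x.equals(null));                           // comparison with null
    }
}
```

Such spot checks do not prove that the contract holds for all values, but they catch the most common mistakes quickly.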

equals contract violation

Many classes, such as those in the Java Collections Framework, depend on the implementation of equals(), so you should not neglect it: violating the contract of this method can lead to unpredictable behavior of the application, and the cause will be quite difficult to find. According to the principle of reflexivity, every object must be equal to itself. If this principle is violated, then after adding an object to a collection and searching for it with contains(), we will not be able to find the object we have just added.

The symmetry condition states that any two objects must be equal regardless of the order in which they are compared. For example, if you have a class containing a single field of type String, it would be incorrect for its equals method to compare this field with a plain String, because in the reverse comparison String.equals will always return false.
// Symmetry violation
public class SomeStringify {
    private String s;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o instanceof SomeStringify) {
            return s.equals(((SomeStringify) o).s);
        }
        // symmetry violation: the classes are of different origin
        if (o instanceof String) {
            return s.equals(o);
        }
        return false;
    }
}
// Correct definition of the equals method
@Override
public boolean equals(Object o) {
    if (this == o) return true;
    return o instanceof SomeStringify &&
            ((SomeStringify) o).s.equals(s);
}
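To see the asymmetry in action, here is a small runnable sketch. BrokenStringify is a hypothetical copy of the asymmetric class above with a constructor and a hashCode added so that it can be instantiated. The expected values in the comments follow directly from the two equals implementations involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of the asymmetric class, with a constructor added.
final class BrokenStringify {
    private final String s;

    BrokenStringify(String s) {
        this.s = s;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o instanceof BrokenStringify) return s.equals(((BrokenStringify) o).s);
        if (o instanceof String) return s.equals(o); // the asymmetric branch
        return false;
    }

    @Override
    public int hashCode() {
        return s.hashCode();
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        BrokenStringify wrapped = new BrokenStringify("abc");
        String plain = "abc";

        System.out.println(wrapped.equals(plain)); // true
        System.out.println(plain.equals(wrapped)); // false: String knows nothing about our class

        List<Object> list = new ArrayList<>();
        list.add(wrapped);
        // The answer depends on which object the collection calls equals on,
        // which is exactly why an asymmetric equals is dangerous.
        System.out.println(list.contains(plain));
    }
}
```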
It follows from transitivity that if the first of three objects equals the second, and the second equals the third, then the first must equal the third. This principle is easily violated when you need to extend a base class by adding a significant component to it. For example, suppose you need to add a color to a Point class with coordinates x and y by extending it. To do so, you declare a ColorPoint class with a color field. If the extended class simply relies on the parent's equals, and the parent compares only the x and y coordinates, then two points with the same coordinates but different colors will be considered equal, which is incorrect. The derived class therefore has to be taught to distinguish colors. There are two ways to do this, but one violates the rule of symmetry and the other violates transitivity.
// First approach, which violates symmetry
// The method is overridden in the ColorPoint class
@Override
public boolean equals(Object o) {
    if (!(o instanceof ColorPoint)) return false;
    return super.equals(o) && ((ColorPoint) o).color == color;
}
In this case, the call point.equals(colorPoint) returns true, while colorPoint.equals(point) returns false, because ColorPoint expects an object of "its own" class. The rule of symmetry is thus violated. The second approach is to perform a "blind" check when there is no information about the color of the point, i.e. when the argument is a plain Point, and to check the color when that information is available, i.e. when the argument is a ColorPoint.
// The method is overridden in the ColorPoint class
@Override
public boolean equals(Object o) {
    if (!(o instanceof Point)) return false;

    // Blind check, ignoring color
    if (!(o instanceof ColorPoint))
        return super.equals(o);

    // Full check, including the color of the point
    return super.equals(o) && ((ColorPoint) o).color == color;
}
Transitivity is violated here as follows. Suppose the following objects are defined:
ColorPoint p1 = new ColorPoint(1, 2, Color.RED);
Point p2 = new Point(1, 2);
ColorPoint p3 = new ColorPoint(1, 2, Color.BLUE);
Thus, although p1.equals(p2) and p2.equals(p3) both return true, p1.equals(p3) returns false. The second approach also looks less attractive to me, because in some cases the algorithm may be "blinded" and skip part of the comparison without your knowing about it.

A bit of poetry

In general, as far as I understand, there is no definitive solution to this problem. One authoritative author, Cay Horstmann, holds that you can replace the instanceof operator with a call to getClass(), which returns the class of the object, and, before comparing the objects themselves, make sure that they are of exactly the same type, ignoring their common origin. The rules of symmetry and transitivity are then satisfied. On the other side of the barricade stands another author, no less respected in wide circles, Joshua Bloch, who believes that this approach violates Barbara Liskov's substitution principle. This principle states that “calling code must treat a base class in the same way as its subclasses without knowing it.” In the solution proposed by Horstmann this principle is clearly violated, since the result depends on the implementation. In short, it is a murky business.

It should also be noted that Horstmann qualifies his approach and writes in plain English that you need to decide on a strategy when designing your classes: if equality is determined entirely by the superclass, you can use instanceof. Otherwise, when the semantics of the check change in derived classes and the implementation of the method has to move down the hierarchy, you must use getClass().
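The getClass()-based strategy could be sketched roughly as follows. This is an illustration under stated assumptions rather than Horstmann's own code: the constructors are added for the example, and a String field stands in for the Color type.

```java
import java.util.Objects;

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // getClass() instead of instanceof: only objects of exactly the same class can be equal
        if (o == null || getClass() != o.getClass()) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class ColorPoint extends Point {
    private final String color; // assumption: a String stands in for a Color type

    ColorPoint(int x, int y, String color) {
        super(x, y);
        this.color = color;
    }

    @Override
    public boolean equals(Object o) {
        // super.equals has already verified that o is a ColorPoint, so the cast is safe
        return super.equals(o) && Objects.equals(color, ((ColorPoint) o).color);
    }

    @Override
    public int hashCode() {
        return 31 * super.hashCode() + Objects.hashCode(color);
    }
}

class GetClassDemo {
    public static void main(String[] args) {
        ColorPoint p1 = new ColorPoint(1, 2, "RED");
        Point p2 = new Point(1, 2);
        ColorPoint p3 = new ColorPoint(1, 2, "BLUE");

        System.out.println(p1.equals(p2)); // false
        System.out.println(p2.equals(p1)); // false
        System.out.println(p1.equals(p3)); // false
    }
}
```

Symmetry and transitivity hold here, at the price that a ColorPoint is never equal to a plain Point with the same coordinates.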
Joshua Bloch, in turn, proposes to abandon inheritance and use composition instead: the ColorPoint class holds a Point as a field and provides an accessor method asPoint() to obtain the information about the point itself. This avoids breaking any of the rules but, in my opinion, makes the code harder to understand. The third option is to generate the equals method automatically with the IDE. IntelliJ IDEA, by the way, follows Horstmann's approach, letting you choose whether the method is implemented in the superclass or in its descendants.

Finally, the consistency rule states that as long as the objects x and y do not change, a repeated call to x.equals(y) must return the same value as before.

The last rule is that no object may be equal to null. Everything is clear here: null is uncertainty. Is an object equal to uncertainty? It is not clear, i.e. false.
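Returning to the composition approach attributed to Bloch above, a minimal sketch might look like this. The constructors and the String-typed color are assumptions made for the example. Only the idea of holding a Point and exposing it through asPoint() comes from the description in this article.

```java
import java.util.Objects;

final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

// ColorPoint has a Point instead of being a Point.
final class ColorPoint {
    private final Point point;
    private final String color; // assumption: a String stands in for a Color type

    ColorPoint(int x, int y, String color) {
        this.point = new Point(x, y);
        this.color = Objects.requireNonNull(color);
    }

    // View of this colored point as a plain point
    Point asPoint() {
        return point;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ColorPoint)) return false;
        ColorPoint other = (ColorPoint) o;
        return point.equals(other.point) && color.equals(other.color);
    }

    @Override
    public int hashCode() {
        return 31 * point.hashCode() + color.hashCode();
    }
}

class CompositionDemo {
    public static void main(String[] args) {
        ColorPoint p1 = new ColorPoint(1, 2, "RED");
        Point p2 = new Point(1, 2);

        System.out.println(p1.equals(p2));           // false
        System.out.println(p2.equals(p1));           // false
        System.out.println(p1.asPoint().equals(p2)); // true: explicit comparison by coordinates
    }
}
```

The caller now has to say explicitly whether color matters, which is more verbose but leaves no room for a "blind" comparison.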

General algorithm for determining equals

  1. Check whether the reference this and the method parameter o refer to the same object.
    if (this == o) return true;
  2. Check whether the reference o is defined, i.e. whether it is null.
    If the instanceof operator is used later to compare the object types, this step can be skipped, since null instanceof Object evaluates to false.
  3. Compare the types of this and o using the instanceof operator or the getClass() method, guided by the discussion above and your own intuition.
  4. If the class extends a superclass that itself overrides equals, be sure to call super.equals(o).
  5. Cast the parameter o to the required class.
  6. Compare all significant fields of the objects:
    • for primitive types (except float and double), use the == operator
    • for reference fields, call their equals method
    • for arrays, iterate over the elements or use the Arrays.equals() method
    • for float and double, use the comparison methods of the corresponding wrapper classes, Float.compare() and Double.compare()
  7. And finally, answer three questions: is the implemented method symmetric? Transitive? Consistent? The other two principles (reflexivity and non-equality to null) usually hold automatically.
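Put together, the steps above might produce something like the following. Measurement and its fields are hypothetical names chosen to cover each kind of field from step 6. This is a sketch, not the only valid layout.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical class with one field of each kind mentioned in step 6.
final class Measurement {
    private final int sensorId;      // primitive
    private final double value;      // floating point
    private final String unit;       // reference
    private final long[] timestamps; // array

    Measurement(int sensorId, double value, String unit, long[] timestamps) {
        this.sensorId = sensorId;
        this.value = value;
        this.unit = unit;
        this.timestamps = timestamps.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;                    // step 1
        if (!(o instanceof Measurement)) return false; // steps 2 and 3: instanceof also handles null
        // step 4 does not apply: the superclass is Object
        Measurement other = (Measurement) o;           // step 5
        return sensorId == other.sensorId              // step 6
                && Double.compare(value, other.value) == 0
                && Objects.equals(unit, other.unit)
                && Arrays.equals(timestamps, other.timestamps);
    }

    @Override
    public int hashCode() {
        int result = Objects.hash(sensorId, value, unit);
        return 31 * result + Arrays.hashCode(timestamps);
    }
}

class GeneralEqualsDemo {
    public static void main(String[] args) {
        Measurement a = new Measurement(7, Double.NaN, "C", new long[] {1L, 2L});
        Measurement b = new Measurement(7, Double.NaN, "C", new long[] {1L, 2L});

        System.out.println(Double.NaN == Double.NaN); // false: why == is avoided for double
        System.out.println(a.equals(b));              // true, thanks to Double.compare
    }
}
```

The NaN case shows why step 6 singles out float and double: with ==, an object holding NaN would not even be equal to an identical copy of itself.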

hashCode override rules

A hash is a number generated from an object that describes its state at some point in time. In Java this number is used primarily in hash tables such as HashMap. The hash function that derives the number from the object must be implemented so as to distribute elements relatively evenly across the hash table and to minimize the likelihood of collisions, where the function returns the same value for different keys.

hashCode contract

For implementing a hash function, the specification (the documentation of Object.hashCode) defines the following rules:
  • calling hashCode more than once on the same object during a single run of the application must return the same hash value, provided that the object's fields involved in calculating the value have not changed.
  • calling hashCode on two objects must always return the same number if the objects are equal (that is, if calling equals on them returns true).
  • calling hashCode on two unequal objects should preferably return different hash values. This requirement is not mandatory, but fulfilling it has a positive effect on the performance of hash tables.
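The three rules can be observed on the standard String class, without writing any class of our own. The pair "Aa" and "BB" is a well-known example of two unequal strings that share a hash code, which shows that the third rule is only a recommendation.

```java
class HashCodeContractDemo {
    public static void main(String[] args) {
        String a = new String("hello");
        String b = new String("hello");

        // Rule 1: repeated calls on an unchanged object give the same value
        System.out.println(a.hashCode() == a.hashCode()); // true

        // Rule 2: equal objects must have equal hash codes
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true

        // Rule 3 is optional: unequal objects may still collide
        System.out.println("Aa".equals("BB"));                  // false
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
    }
}
```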

The equals and hashCode methods must be overridden together

It follows from the contracts described above that whenever you override equals in your code, you must also override hashCode. Two instances of a class are physically different because they occupy different areas of memory, so they have to be compared by some logical criteria, and two logically equal objects must accordingly return the same hash value. What happens if only one of these methods is overridden?
  1. equals yes, hashCode no

    Suppose we defined equals correctly in our class but left hashCode as it is in the Object class. Then from the point of view of equals the two objects are logically equal, while from the point of view of hashCode they have nothing in common. As a result, having placed an object in a hash table, we risk not getting it back by key.
    For example, like this:

    Map<Point, String> m = new HashMap<>();
    m.put(new Point(1, 1), "Point A");
    // pointName == null
    String pointName = m.get(new Point(1, 1));

    Obviously, the object being stored and the object being looked up are two different objects, although they are logically equal. But since they have different hash values because we violated the contract, we can say that we have lost our object somewhere in the bowels of the hash table.

  2. hashCode yes, equals no

    What happens if we override hashCode and inherit the implementation of equals from the Object class? As you know, the default equals simply compares object references, determining whether they refer to the same object. Let's assume that we wrote hashCode according to all the canons, namely generated it with the IDE, and that it returns the same hash value for logically identical objects. By doing so we have, in effect, already defined some mechanism for comparing two objects.

    In theory, then, the example from the previous item should work. But we still will not be able to find our object in the hash table, although we will get close: at the very least we will find the bucket of the hash table in which the object lies.

    To find an object in a hash table, it is not enough to compare the hash values of the keys. The logical equality of the stored key and the key being looked up is checked as well. That is, there is no way around overriding equals.
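Both cases can be reproduced with a few lines of code. The three classes below are hypothetical stand-ins for the Point class from the example, one for each combination of overridden methods.

```java
import java.util.HashMap;
import java.util.Map;

// Case 1: equals is overridden, hashCode is inherited from Object.
class EqualsOnlyPoint {
    final int x, y;
    EqualsOnlyPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EqualsOnlyPoint)) return false;
        EqualsOnlyPoint p = (EqualsOnlyPoint) o;
        return x == p.x && y == p.y;
    }
}

// Case 2: hashCode is overridden, equals is inherited from Object.
class HashOnlyPoint {
    final int x, y;
    HashOnlyPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public int hashCode() { return 31 * x + y; }
}

// Both methods are overridden consistently.
class FullPoint {
    final int x, y;
    FullPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof FullPoint)) return false;
        FullPoint p = (FullPoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() { return 31 * x + y; }
}

class HashMapLookupDemo {
    public static void main(String[] args) {
        Map<EqualsOnlyPoint, String> m1 = new HashMap<>();
        m1.put(new EqualsOnlyPoint(1, 1), "Point A");
        // Almost certainly null: the two keys get unrelated identity hash codes
        System.out.println(m1.get(new EqualsOnlyPoint(1, 1)));

        Map<HashOnlyPoint, String> m2 = new HashMap<>();
        m2.put(new HashOnlyPoint(1, 1), "Point A");
        // null: the right bucket is found, but the reference comparison fails
        System.out.println(m2.get(new HashOnlyPoint(1, 1)));

        Map<FullPoint, String> m3 = new HashMap<>();
        m3.put(new FullPoint(1, 1), "Point A");
        System.out.println(m3.get(new FullPoint(1, 1))); // Point A
    }
}
```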

General algorithm for determining hashCode

Here, it seems to me, you shouldn't worry too much and should simply generate the method in your favorite IDE. All those bit shifts to the right and left in search of the golden mean, i.e. a good distribution of hash values, are for really hardcore folks. Personally, I doubt that I can do it better or faster than IntelliJ IDEA does.
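For reference, a hand-written hashCode in the common "multiply by 31 and add" style might look like the sketch below. It is an illustration of the general pattern, not the exact output of any particular IDE, and the Person class with its fields is hypothetical.

```java
import java.util.Objects;

// Hypothetical class used only to show the typical shape of hashCode.
final class Person {
    private final String name;
    private final int age;
    private final double height;

    Person(String name, int age, double height) {
        this.name = name;
        this.age = age;
        this.height = height;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        return age == other.age
                && Double.compare(height, other.height) == 0
                && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Combine the hashes of exactly the fields that equals compares.
        int result = Objects.hashCode(name);         // 0 for null
        result = 31 * result + Integer.hashCode(age);
        result = 31 * result + Double.hashCode(height);
        return result;
    }
}

class ManualHashCodeDemo {
    public static void main(String[] args) {
        Person a = new Person("Ann", 30, 1.70);
        Person b = new Person("Ann", 30, 1.70);

        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

A shorter alternative is Objects.hash(name, age, height), which trades a little boxing and array allocation for brevity. Either way, the important point is that hashCode uses the same fields as equals.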

Instead of a conclusion

Thus, we see that the equals and hashCode methods play a well-defined role in the Java language and are designed to establish the logical equality of two objects. For equals this relates directly to comparing objects. For hashCode the relation is indirect: it is used when it is necessary, let's say, to determine the approximate location of an object in a hash table or a similar data structure in order to speed up the search for it.

In addition to the equals and hashCode contracts, there is one more requirement related to comparing objects: the compareTo method of the Comparable interface should be consistent with equals. This is a strong recommendation rather than a strict rule, and it means that x.compareTo(y) == 0 should hold exactly when x.equals(y) returns true. In other words, the logical comparison of two objects should not contradict itself anywhere in the application and should always be consistent.
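A well-known case in the standard library where compareTo is not consistent with equals is BigDecimal, and it shows what the inconsistency costs in practice. A short sketch:

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class CompareToConsistencyDemo {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");

        System.out.println(a.compareTo(b) == 0); // true: numerically equal
        System.out.println(a.equals(b));         // false: the scales differ

        // HashSet relies on equals and hashCode
        Set<BigDecimal> hashSet = new HashSet<>();
        hashSet.add(a);
        hashSet.add(b);

        // TreeSet relies on compareTo
        Set<BigDecimal> treeSet = new TreeSet<>();
        treeSet.add(a);
        treeSet.add(b);

        System.out.println(hashSet.size()); // 2
        System.out.println(treeSet.size()); // 1
    }
}
```

The same two values are treated as distinct by one collection and as duplicates by the other, which is exactly the kind of contradiction the consistency requirement is meant to prevent.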

Sources

  • Effective Java, Second Edition. Joshua Bloch. A free translation of a very good book.
  • Java, a professional's library. Volume 1. Basics. Cay Horstmann. A little less theory and more practice, but not everything is analyzed in as much detail as in Bloch's book. It does, however, offer its own view on the same equals().
  • Data structures in pictures. HashMap. An extremely useful article on how HashMap is built in Java, as an alternative to reading the source code.