Aleksandr Zimin

Contracts equals and hashCode or whatever it is

Published in the Random EN group
Most Java programmers know that the equals and hashCode methods are closely related and that both should be overridden consistently in their classes. Somewhat fewer know why this is so and what unpleasant consequences follow when the rule is broken. I propose to review the idea behind these methods, recall their purpose, and understand why they are so tightly connected. Like the previous article about class loading, I wrote this one for myself, to finally work through all the details and stop going back to third-party sources. I will therefore welcome constructive criticism: if there are gaps anywhere, they should be closed. The article, alas, turned out rather long.

equals override rules

The equals() method exists in Java to confirm or refute that two objects of the same origin are logically equal. That is, when comparing two objects, the programmer needs to know whether their significant fields are equivalent. Not all fields have to be identical, because equals() means logical equality specifically. Sometimes, though, there is no real need to override this method. As the saying goes, the easiest way to avoid problems with a mechanism is not to use it. Note also that once you violate the equals contract, you can no longer predict how other objects and data structures will interact with your object, and the cause of the resulting error will be very hard to find.

When not to override this method

  • When each instance of the class is unique.
    This applies mostly to classes that provide behavior rather than hold data, such as Thread. For them the implementation of equals inherited from Object is entirely sufficient. Enum classes (Enum) are another example.
  • When the class does not actually need to determine whether its instances are equivalent.
    For example, java.util.Random has no need to compare instances to find out whether they would return the same sequence of random numbers. The nature of the class simply does not imply such behavior.
  • When the class you are extending already has its own implementation of equals and the behavior of that implementation suits you.
    For example, for Set, List, and Map implementations, equals is implemented in AbstractSet, AbstractList, and AbstractMap respectively.
  • And finally, there is no need to override equals when your class is private or package-private and you are certain that the method will never be called.

equals contract

When overriding equals, the developer must follow the basic rules set out in the documentation of java.lang.Object.
  • Reflexivity
    For any given value x, the expression x.equals(x) must return true.
    "Given" means that x != null.
  • Symmetry
    For any given values x and y, x.equals(y) must return true if and only if y.equals(x) returns true.
  • Transitivity
    For any given values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.
  • Consistency
    For any given values x and y, a repeated call to x.equals(y) must return the same value as the previous call, provided that the fields used to compare the two objects have not changed between the calls.
  • Comparison with null
    For any given value x, the call x.equals(null) must return false.
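
The five rules above can be spot-checked on sample values. The following is only a sketch: EqualsContractCheck and its method names are hypothetical helpers invented for this illustration, and passing such checks on a few samples does not prove that an implementation is correct.

```java
// Hypothetical helper (not part of the JDK): spot-checks the equals contract on sample values.
final class EqualsContractCheck {
    private EqualsContractCheck() {}

    // Reflexivity: x.equals(x) must be true.
    static boolean reflexive(Object x) {
        return x.equals(x);
    }

    // Symmetry: both directions must give the same answer.
    static boolean symmetric(Object x, Object y) {
        return x.equals(y) == y.equals(x);
    }

    // Transitivity: if x equals y and y equals z, then x must equal z.
    // When the premise does not hold, the rule is satisfied vacuously.
    static boolean transitive(Object x, Object y, Object z) {
        return !(x.equals(y) && y.equals(z)) || x.equals(z);
    }

    // Consistency: two calls in a row on unchanged objects must agree.
    static boolean consistent(Object x, Object y) {
        return x.equals(y) == x.equals(y);
    }

    // Comparison with null must return false.
    static boolean notEqualToNull(Object x) {
        return !x.equals(null);
    }
}

class EqualsContractDemo {
    public static void main(String[] args) {
        String a = "java";
        String b = new String("java"); // a distinct object with the same content
        String c = "ja" + "va";

        System.out.println(EqualsContractCheck.reflexive(a));
        System.out.println(EqualsContractCheck.symmetric(a, b));
        System.out.println(EqualsContractCheck.transitive(a, b, c));
        System.out.println(EqualsContractCheck.consistent(a, b));
        System.out.println(EqualsContractCheck.notEqualToNull(a));
    }
}
```

String is used here only because its equals is known to honor the contract, so every check returns true.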

Violation of the equals contract

Many classes, such as those in the Java Collections Framework, depend on the implementation of equals(), so it should not be neglected: violating its contract can make the application behave unpredictably, and the cause will be quite hard to track down. According to the principle of reflexivity, every object must be equal to itself. If this principle is violated, then after adding an object to a collection and searching for it with contains(), we will not find the object we have just added. The symmetry condition states that two objects must compare as equal regardless of the order in which they are compared. For example, in a class that contains a single field of type String, it would be wrong for equals to compare that field with a plain string passed as the argument, because the reverse comparison, called on the String, will always return false.
// Symmetry violation
public class SomeStringify {
    private String s;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o instanceof SomeStringify) {
            return s.equals(((SomeStringify) o).s);
        }
        // symmetry violation: the classes are of different origin
        if (o instanceof String) {
            return s.equals(o);
        }
        return false;
    }
}
// Correct definition of the equals method
@Override
public boolean equals(Object o) {
    if (this == o) return true;
    return o instanceof SomeStringify &&
            ((SomeStringify) o).s.equals(s);
}
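
To see the asymmetry in action, here is a minimal runnable sketch. BrokenStringify is a hypothetical copy of the first variant of equals above, with a constructor and a hashCode added only so that the example can be compiled and run on its own.

```java
// Hypothetical copy of the symmetry-violating class, made instantiable for the demo.
final class BrokenStringify {
    private final String s;

    BrokenStringify(String s) {
        this.s = s;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o instanceof BrokenStringify) {
            return s.equals(((BrokenStringify) o).s);
        }
        // Accepting a plain String is what breaks symmetry.
        if (o instanceof String) {
            return s.equals(o);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return s.hashCode();
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        BrokenStringify wrapper = new BrokenStringify("abc");
        String plain = "abc";

        System.out.println(wrapper.equals(plain)); // true
        System.out.println(plain.equals(wrapper)); // false: String.equals accepts only a String
    }
}
```

The second call cannot be fixed from our side, because String knows nothing about our class. The only way to restore symmetry is to stop accepting String in our own equals.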
The transitivity condition says that if the first of three objects is equal to the second and the second is equal to the third, then the first must be equal to the third. This principle is easy to break when a base class has to be extended with a new significant component. For example, suppose the class Point with coordinates x and y needs a color, which is added by subclassing. To do this, we declare a class ColorPoint with a color field. If the subclass simply calls the parent equals, and the parent compares only the coordinates x and y, then two points with the same coordinates but different colors will be considered equal, which is wrong. The derived class therefore has to be taught to distinguish colors. There are two ways to do this, but one violates symmetry and the other violates transitivity.
// First approach, which violates symmetry
// The method is overridden in the ColorPoint class
@Override
public boolean equals(Object o) {
    if (!(o instanceof ColorPoint)) return false;
    return super.equals(o) && ((ColorPoint) o).color == color;
}
In this case the call point.equals(colorPoint) returns true, while colorPoint.equals(point) returns false, because it expects an object of its "own" class. Symmetry is therefore violated. The second approach performs a "blind" comparison when no color information is available, i.e. when the argument is a plain Point, and compares the color as well when it is available, i.e. when the argument is a ColorPoint.
// The method is overridden in the ColorPoint class
@Override
public boolean equals(Object o) {
    if (!(o instanceof Point)) return false;

    // Blind comparison
    if (!(o instanceof ColorPoint))
        return super.equals(o);

    // Full comparison, including the color of the point
    return super.equals(o) && ((ColorPoint) o).color == color;
}
Transitivity is violated here as follows. Suppose we define these objects:
ColorPoint p1 = new ColorPoint(1, 2, Color.RED);
Point p2 = new Point(1, 2);
ColorPoint p3 = new ColorPoint(1, 2, Color.BLUE);
Then, although p1.equals(p2) and p2.equals(p3) both return true, p1.equals(p3) returns false. In my opinion the second approach is also the less attractive one, because in some cases the algorithm silently "goes blind" and skips the full comparison, and you may never find out.

A lyrical digression

As far as I understand, there is no single accepted solution to this problem. One authoritative author, Cay Horstmann, holds that the instanceof operator can be replaced with a call to getClass(), which returns the object's class: before comparing the objects themselves, make sure they are of exactly the same type and ignore their common origin. Symmetry and transitivity are then satisfied. On the other side of the barricade stands another author, no less respected, Joshua Bloch, who believes that this approach violates the Liskov substitution principle. The principle states that "calling code must be able to work with a base class in exactly the same way as with its subclasses, without knowing it". In the solution proposed by Horstmann this principle is clearly violated, because the result depends on the concrete implementation class. In short, the matter is murky.

It should also be noted that Horstmann qualifies his rule and says plainly that the strategy must be chosen when the classes are designed: if the equality test is carried out entirely by the superclass, instanceof can be used. Otherwise, when the semantics of the check vary with the derived class and the implementation has to move down the hierarchy, getClass() must be used.
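
The getClass() strategy attributed to Horstmann above might look roughly like this. It is a sketch under assumptions: the Color enum is a hypothetical stand-in for whatever color type the article's examples use, and the constructors and hashCode methods are added only to make the example self-contained.

```java
import java.util.Objects;

// Hypothetical stand-in for the color type used in the examples above.
enum Color { RED, BLUE }

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // getClass() instead of instanceof: a Point is never equal to a ColorPoint.
        if (o == null || getClass() != o.getClass()) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class ColorPoint extends Point {
    private final Color color;

    ColorPoint(int x, int y, Color color) {
        super(x, y);
        this.color = color;
    }

    @Override
    public boolean equals(Object o) {
        // super.equals has already verified that o has exactly the same class as this.
        if (!super.equals(o)) return false;
        return color == ((ColorPoint) o).color;
    }

    @Override
    public int hashCode() {
        return Objects.hash(super.hashCode(), color);
    }
}

class GetClassDemo {
    public static void main(String[] args) {
        ColorPoint p1 = new ColorPoint(1, 2, Color.RED);
        Point p2 = new Point(1, 2);
        ColorPoint p3 = new ColorPoint(1, 2, Color.BLUE);

        System.out.println(p1.equals(p2)); // false
        System.out.println(p2.equals(p1)); // false, so symmetry holds
        System.out.println(p1.equals(p3)); // false, and no chain through p2 exists
    }
}
```

The price is exactly the one Bloch points out: a ColorPoint can no longer be treated as equal to a Point with the same coordinates, even in code that cares only about coordinates.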
Joshua Bloch, in turn, suggests giving up inheritance in favor of object composition: the ColorPoint class contains a Point and provides an accessor method asPoint() that returns the point itself. This avoids breaking any of the rules but, in my opinion, makes the code harder to understand. A third option is to let the IDE generate the equals method automatically. IntelliJ IDEA, by the way, generates the method along Horstmann's lines and lets you choose whether the method is implemented in the superclass or in its subclasses.

Finally, the consistency rule states that if the objects x and y do not change, a repeated call to x.equals(y) must return the same value as before. The last rule is that no object may be equal to null. This one is straightforward: null means "undefined". Is an object equal to something undefined? That cannot be known, hence false.
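
The composition approach attributed to Bloch above could be sketched as follows. This is a separate, self-contained illustration: the Color enum, the constructors, and the hashCode methods are assumptions added for the example, and only the asPoint() name comes from the text.

```java
import java.util.Objects;

// Hypothetical stand-in for the color type used in the examples above.
enum Color { RED, BLUE }

final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

// ColorPoint no longer extends Point; it contains one.
final class ColorPoint {
    private final Point point;
    private final Color color;

    ColorPoint(int x, int y, Color color) {
        this.point = new Point(x, y);
        this.color = Objects.requireNonNull(color);
    }

    // Returns the point-only view of this object.
    Point asPoint() {
        return point;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ColorPoint)) return false;
        ColorPoint other = (ColorPoint) o;
        return point.equals(other.point) && color == other.color;
    }

    @Override
    public int hashCode() {
        return Objects.hash(point, color);
    }
}

class CompositionDemo {
    public static void main(String[] args) {
        ColorPoint red = new ColorPoint(1, 2, Color.RED);
        Point plain = new Point(1, 2);

        System.out.println(red.equals(plain));           // false
        System.out.println(plain.equals(red));           // false
        System.out.println(red.asPoint().equals(plain)); // true: coordinates compared explicitly
    }
}
```

Because the two classes are unrelated types, a mixed comparison is always false in both directions, and a coordinate-only comparison has to be requested explicitly through asPoint().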

General algorithm for determining equals

  1. Check whether the reference this and the method parameter o refer to the same object.
    if (this == o) return true;
  2. Check whether the reference o is defined, i.e. whether it is null.
    If the object types are later compared with the instanceof operator, this step can be skipped, because null instanceof Object evaluates to false.
  3. Compare the types of this and o using the instanceof operator or the getClass() method, guided by the discussion above and your own judgment.
  4. If equals is overridden in a subclass of a class that itself overrides equals, do not forget to call super.equals(o).
  5. Cast the parameter o to the required class.
  6. Compare all significant fields of the objects:
    • for primitive types (except float and double), use the == operator
    • for reference fields, call their equals method
    • for arrays, iterate over the elements or use Arrays.equals()
    • for float and double, use the comparison methods of the corresponding wrapper classes, Float.compare() and Double.compare()
  7. Finally, answer three questions: is the implemented method symmetric? Transitive? Consistent? The other two properties (reflexivity and inequality to null) usually hold automatically.
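
The steps above can be put together in one small class. Measurement and its fields are hypothetical and chosen only so that every kind of field from step 6 appears once. Step 4 does not apply here because the class extends Object directly.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical class used only to illustrate the algorithm.
final class Measurement {
    private final int id;        // primitive field
    private final double value;  // floating-point field
    private final String unit;   // reference field, may be null
    private final int[] samples; // array field

    Measurement(int id, double value, String unit, int[] samples) {
        this.id = id;
        this.value = value;
        this.unit = unit;
        this.samples = samples.clone();
    }

    @Override
    public boolean equals(Object o) {
        // Step 1: same reference.
        if (this == o) return true;
        // Steps 2 and 3: instanceof also rejects null.
        if (!(o instanceof Measurement)) return false;
        // Step 5: cast to the required class.
        Measurement other = (Measurement) o;
        // Step 6: compare every significant field with the appropriate tool.
        return id == other.id
                && Double.compare(value, other.value) == 0
                && Objects.equals(unit, other.unit) // null-safe call to unit.equals
                && Arrays.equals(samples, other.samples);
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals, so equal objects get equal hashes.
        int result = Objects.hash(id, value, unit);
        return 31 * result + Arrays.hashCode(samples);
    }
}
```

Double.compare is the reason for the special treatment of floating-point fields: unlike ==, it treats NaN as equal to itself and distinguishes 0.0 from -0.0, which keeps equals reflexive and consistent with the hash of the boxed value.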

hashCode override rules

A hash is a number generated from an object that describes its state at some point in time. This number is used in Java primarily in hash tables such as HashMap. The hash function that derives the number from an object should be implemented so that elements are distributed relatively evenly across the hash table, and so that collisions, where the function returns the same value for different keys, are as rare as possible.

hashCode contract

The documentation of java.lang.Object defines the following rules for implementing the hash function:
  • calling hashCode one or more times on the same object must return the same hash value, provided that the fields of the object involved in computing the value have not changed.
  • calling hashCode on two objects must always return the same number if those objects are equal (calling equals on them returns true).
  • calling hashCode on two unequal objects should preferably return different hash values. This is not mandatory, but keep in mind that meeting it improves the performance of hash tables.

The equals and hashCode methods need to be overridden together

It follows from the contracts above that whenever you override equals in your code, you must also override hashCode. Two instances of a class are physically different objects, since they occupy different areas of memory, so they have to be compared by some logical criteria. Accordingly, two logically equivalent objects must return the same hash value. What happens if only one of the two methods is overridden?
  1. equals yes, hashCode no

    Suppose we have correctly defined equals in our class but left hashCode as it is in Object. Then from the point of view of equals two objects will be logically equal, while from the point of view of hashCode they will have nothing in common. So, having put an object into a hash table, we risk not getting it back by key.
    For example:

    Map<Point, String> m = new HashMap<>();
    m.put(new Point(1, 1), "Point A");
    // pointName == null
    String pointName = m.get(new Point(1, 1));

    Clearly, the stored object and the one used for the lookup are two different objects, although they are logically equal. But because they have different hash values, since we violated the contract, our object is effectively lost somewhere in the depths of the hash table.

  2. hashCode yes, equals no

    What happens if we override hashCode and inherit equals from Object? As is well known, the default equals simply compares references to determine whether they point to the same object. Suppose we wrote hashCode by the book, that is, generated it with the IDE, and it returns the same hash value for logically identical objects. By doing so we have obviously already defined some mechanism for comparing two objects.

    So the example from the previous item should, in theory, work. But we still will not find our object in the hash table, although we get closer: at least we find the bucket in which the object lies.

    To find an object in a hash table, comparing the hash values of the keys is not enough; the table also checks the logical equality of the stored key and the key being looked up. In other words, there is no way around overriding equals.
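
Both cases can be reproduced with a short program. The three key classes below are hypothetical and differ only in which of the two methods they override. Note that in the first case the lookup fails only "almost always": identity hash codes of two objects can in principle coincide, so that line is not guaranteed to print null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Case 1: equals is overridden, hashCode is inherited from Object.
class EqualsOnlyPoint {
    final int x, y;

    EqualsOnlyPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof EqualsOnlyPoint)) return false;
        EqualsOnlyPoint p = (EqualsOnlyPoint) o;
        return x == p.x && y == p.y;
    }
}

// Case 2: hashCode is overridden, equals is inherited from Object.
class HashOnlyPoint {
    final int x, y;

    HashOnlyPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public int hashCode() { return Objects.hash(x, y); }
}

// Both methods overridden consistently.
class FullPoint {
    final int x, y;

    FullPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FullPoint)) return false;
        FullPoint p = (FullPoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() { return Objects.hash(x, y); }
}

class HashMapLookupDemo {
    // Stores a value under storedKey, then looks it up with a separate key object.
    static <K> String putThenGet(K storedKey, K lookupKey) {
        Map<K, String> m = new HashMap<>();
        m.put(storedKey, "Point A");
        return m.get(lookupKey);
    }

    public static void main(String[] args) {
        // Almost always null: the two keys have unrelated identity hash codes.
        System.out.println(putThenGet(new EqualsOnlyPoint(1, 1), new EqualsOnlyPoint(1, 1)));
        // null: the right bucket is found, but Object.equals compares references.
        System.out.println(putThenGet(new HashOnlyPoint(1, 1), new HashOnlyPoint(1, 1)));
        // Point A
        System.out.println(putThenGet(new FullPoint(1, 1), new FullPoint(1, 1)));
    }
}
```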

General algorithm for determining hashCode

Here, it seems to me, there is no need to worry much: just generate the method in your favorite IDE. All those bit shifts to the right and to the left in search of the golden ratio, that is, of a uniform distribution, are for the truly stubborn. Personally, I doubt I could do it better or faster than IntelliJ IDEA does.
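
For reference, a hand-written hashCode in the style that IDEs commonly generate might look like the sketch below. LabeledPoint and its fields are hypothetical; the multiplier 31 is a conventional choice, not a requirement.

```java
import java.util.Objects;

// Hypothetical class; the field names are made up for the example.
final class LabeledPoint {
    private final int x;
    private final int y;
    private final String label; // may be null

    LabeledPoint(int x, int y, String label) {
        this.x = x;
        this.y = y;
        this.label = label;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof LabeledPoint)) return false;
        LabeledPoint other = (LabeledPoint) o;
        return x == other.x && y == other.y && Objects.equals(label, other.label);
    }

    @Override
    public int hashCode() {
        // Combine exactly the fields that equals compares.
        int result = Integer.hashCode(x);
        result = 31 * result + Integer.hashCode(y);
        result = 31 * result + Objects.hashCode(label); // contributes 0 when label is null
        return result;
    }
}
```

A shorter alternative is return Objects.hash(x, y, label), which is easier to read at the cost of boxing the primitives and allocating a varargs array on every call.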

Instead of a conclusion

Thus the equals and hashCode methods play a well-defined role in Java: they establish the logical equality of two objects. equals does so directly, by comparing objects; hashCode does so indirectly, when the approximate location of an object in a hash table or a similar data structure has to be determined in order to speed up lookup. Besides the equals and hashCode contracts, there is one more requirement related to comparing objects: the compareTo method of the Comparable interface should be consistent with equals. It asks the developer to ensure that x.equals(y) returns true whenever x.compareTo(y) == 0; the Comparable documentation strongly recommends this without strictly requiring it. In other words, the logical comparison of two objects should not contradict itself anywhere in the application and should always be consistent.
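
A well-known class in the JDK whose compareTo is not consistent with equals is BigDecimal, and it shows what the inconsistency costs. The demo class and its helper method names below are hypothetical; the BigDecimal behavior itself is standard.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

class CompareToConsistencyDemo {
    // Number of distinct elements according to equals and hashCode.
    static int distinctByEquals(List<BigDecimal> values) {
        return new HashSet<>(values).size();
    }

    // Number of distinct elements according to compareTo.
    static int distinctByCompareTo(List<BigDecimal> values) {
        return new TreeSet<>(values).size();
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");

        System.out.println(a.compareTo(b) == 0); // true: same numeric value
        System.out.println(a.equals(b));         // false: the scales differ

        List<BigDecimal> values = Arrays.asList(a, b);
        System.out.println(distinctByEquals(values));    // 2
        System.out.println(distinctByCompareTo(values)); // 1
    }
}
```

The same two values therefore count as two elements in a HashSet and as one element in a TreeSet, which is exactly the kind of contradiction the consistency requirement is meant to prevent.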

Sources

  • Effective Java, Second Edition. Joshua Bloch. A free translation of a very good book.
  • Java, the library of the professional. Volume 1. Basics. Cay Horstmann. A little less theory and more practice, but not as detailed as Bloch. It does, however, offer its own view on the same equals().
  • Data structures in pictures. HashMap. An extremely useful article on HashMap in Java, as an alternative to reading the source code.