equals & hashCode methods: practice of use

Published in the Random EN group
Hello! Today we will talk about two important methods in Java: equals() and hashCode(). This is not the first time we've met them: at the beginning of the JavaRush course there was a short lecture about equals(). Read it if you've forgotten it or haven't seen it before. In today's lesson we'll look at these methods in detail, and believe me, there's a lot to talk about! Before we move on to something new, let's refresh our memory on what we've already covered :) As you remember, comparing two objects with the “==” operator is usually a bad idea, because “==” compares references, not the contents of the objects. Here is our example with cars from a recent lecture:
public class Car {

   String model;
   int maxSpeed;

   public static void main(String[] args) {

       Car car1 = new Car();
       car1.model = "Ferrari";
       car1.maxSpeed = 300;

       Car car2 = new Car();
       car2.model = "Ferrari";
       car2.maxSpeed = 300;

       System.out.println(car1 == car2);
   }
}
Console output:

false
It would seem that we have created two identical objects of the class Car: all the fields of the two cars are the same, but the result of the comparison is still false. We already know the reason: the references car1 and car2 point to different addresses in memory, so they are not equal. But we want to compare two objects, not two references. The best tool for comparing objects is the equals() method.
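The difference is easy to see with String, a class that already overrides equals(). The sketch below is only an illustration; the class name StringComparisonDemo is made up for this example.

```java
class StringComparisonDemo {

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with the same contents
        String model1 = new String("Ferrari");
        String model2 = new String("Ferrari");

        System.out.println(model1 == model2);      // false: two different references
        System.out.println(model1.equals(model2)); // true: the same characters
    }
}
```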

equals() method

You may remember that we do not create this method from scratch, but override it, since equals() is defined in the class Object. However, in its default form it is of little use:
public boolean equals(Object obj) {
   return (this == obj);
}
This is how the method equals() is defined in the class Object: the same comparison of references. Why was it made like this? Well, how could the creators of the language know which objects in your program are considered equal and which are not? :) This is the main idea of the method equals(): the creator of the class decides which characteristics determine whether two objects of this class are equal. You express that decision by overriding equals() in your class. If you don't quite understand the meaning of “you define the characteristics yourself,” let's look at an example. Here is a simple class that describes a person, Man.
public class Man {

   private String noseSize;
   private String eyesColor;
   private String haircut;
   private boolean scars;
   private int dnaCode;

   public Man(String noseSize, String eyesColor, String haircut, boolean scars, int dnaCode) {
       this.noseSize = noseSize;
       this.eyesColor = eyesColor;
       this.haircut = haircut;
       this.scars = scars;
       this.dnaCode = dnaCode;
   }

   //getters, setters, etc.
}
Let's say we are writing a program that needs to determine whether two people are twins or just look-alikes. We have five characteristics: nose size, eye color, hairstyle, presence of scars, and the result of a DNA test (for simplicity, in the form of a code number). Which of these characteristics do you think will allow our program to identify twins? Of course, only the DNA test can provide a guarantee. Two people can have the same eye color, hairstyle, nose, and even scars: there are many people in the world, and coincidences are impossible to avoid. We need a reliable mechanism, and only the result of a DNA test allows us to draw an accurate conclusion. What does this mean for our method equals()? We need to override it in the class Man, taking into account the requirements of our program. The method must compare the int dnaCode field of the two objects, and if the values are equal, then the objects are equal.
@Override
public boolean equals(Object o) {
   Man man = (Man) o;
   return dnaCode == man.dnaCode;
}
Is it really that simple? Not really. We missed something. For our objects we have defined only one “significant” field by which their equality is established: dnaCode. Now imagine that we had not 1, but 50 such “significant” fields, and that two objects are equal only if all 50 fields are equal. This could happen too. The main problem is that comparing 50 fields takes time and resources. Now imagine that in addition to the class Man we have a class Woman with exactly the same fields as Man. If another programmer uses your classes, they can easily write something like this:
public static void main(String[] args) {

   Man man = new Man(........); //a bunch of parameters in the constructor

   Woman woman = new Woman(.........);//same bunch of parameters.

   System.out.println(man.equals(woman));
}
In this case, there is no point in checking the field values: we are looking at objects of two different classes, and they cannot be equal in principle! Worse, the cast (Man) o in our current method would fail with a ClassCastException when o is a Woman. This means that we need to add a check to the method equals() that both objects belong to the same class. It's good that we thought of this!
@Override
public boolean equals(Object o) {
   if (getClass() != o.getClass()) return false;
   Man man = (Man) o;
   return dnaCode == man.dnaCode;
}
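To see what the class check buys us, here is a minimal sketch. The classes NaiveMan, CheckedMan and Woman are hypothetical, stripped-down stand-ins for the classes in this lecture, reduced to the dnaCode field. The sketch compares the version without the check against the version with it.

```java
class ClassCheckDemo {

    // Hypothetical, stripped-down classes: only the dnaCode field is kept
    static class Woman {
        int dnaCode = 12345;
    }

    static class NaiveMan {
        int dnaCode = 12345;

        @Override
        public boolean equals(Object o) {
            NaiveMan man = (NaiveMan) o; // fails if o is not a NaiveMan
            return dnaCode == man.dnaCode;
        }

        @Override
        public int hashCode() {
            return dnaCode; // overridden together with equals()
        }
    }

    static class CheckedMan {
        int dnaCode = 12345;

        @Override
        public boolean equals(Object o) {
            if (getClass() != o.getClass()) return false; // different classes are never equal
            CheckedMan man = (CheckedMan) o;
            return dnaCode == man.dnaCode;
        }

        @Override
        public int hashCode() {
            return dnaCode; // overridden together with equals()
        }
    }

    // Returns true if the naive equals() throws instead of answering false
    static boolean naiveEqualsThrows() {
        try {
            new NaiveMan().equals(new Woman());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(naiveEqualsThrows());                  // true
        System.out.println(new CheckedMan().equals(new Woman())); // false
    }
}
```

Without the check, the comparison does not return false: it throws. With the check, it quietly answers false, which is what a caller expects from equals().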
But maybe we forgot something else? Hmm... At a minimum, we should check that we are not comparing the object with itself! If references A and B point to the same address in memory, then they are the same object, and we also don’t need to waste time comparing 50 fields.
@Override
public boolean equals(Object o) {
   if (this == o) return true;
   if (getClass() != o.getClass()) return false;
   Man man = (Man) o;
   return dnaCode == man.dnaCode;
}
In addition, it would not hurt to add a check for null: no object can be equal to null, so in that case there is no point in additional checks (and calling o.getClass() on null would throw a NullPointerException). Taking all this into account, our equals() method for the class Man will look like this:
@Override
public boolean equals(Object o) {
   if (this == o) return true;
   if (o == null || getClass() != o.getClass()) return false;
   Man man = (Man) o;
   return dnaCode == man.dnaCode;
}
We carry out all the initial checks mentioned above. If it turns out that:
  • we compare two objects of the same class
  • this is not the same object
  • we are not comparing our object with null
...then we move on to comparing significant characteristics. In our case, these are the dnaCode fields of the two objects. When overriding the method equals(), be sure to comply with these requirements:
  1. Reflexivity.

    Any object must be equal to itself: a.equals(a) must return true.
    We have already taken this requirement into account. Our method states:

    if (this == o) return true;

  2. Symmetry.

    If a.equals(b) == true, then b.equals(a) must also return true.
    Our method also meets this requirement.

  3. Transitivity.

    If two objects are equal to some third object, then they must be equal to each other.
    If a.equals(b) == true and b.equals(c) == true, then the check a.equals(c) must also return true.

  4. Consistency.

    The result of equals() should change only when the fields it uses change. If the data of the two objects has not changed, repeated equals() checks should always return the same result.

  5. Inequality with null.

    For any object, the check a.equals(null) must return false.

This is not just a set of “useful recommendations”, but a strict contract for the method, laid out in the Oracle documentation.
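The five requirements can be spot-checked in code. The sketch below is an illustration only: ContractCheck and satisfiesContract are hypothetical names, and the nested Man is a reduced copy of the class from this lecture with just the dnaCode field. A check like this tests a few sample objects; it does not prove the contract for every possible input.

```java
class ContractCheck {

    // Reduced copy of Man: only the significant field is kept
    static final class Man {
        private final int dnaCode;

        Man(int dnaCode) {
            this.dnaCode = dnaCode;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Man man = (Man) o;
            return dnaCode == man.dnaCode;
        }

        @Override
        public int hashCode() {
            return dnaCode; // overridden together with equals()
        }
    }

    // Spot check of the five requirements for three sample objects
    static boolean satisfiesContract(Object a, Object b, Object c) {
        boolean reflexive = a.equals(a);
        boolean symmetric = a.equals(b) == b.equals(a);
        boolean transitive = !(a.equals(b) && b.equals(c)) || a.equals(c);
        boolean consistent = a.equals(b) == a.equals(b); // repeated calls agree
        boolean notEqualToNull = !a.equals(null);
        return reflexive && symmetric && transitive && consistent && notEqualToNull;
    }

    public static void main(String[] args) {
        Man a = new Man(12345);
        Man b = new Man(12345);
        Man c = new Man(12345);
        System.out.println(satisfiesContract(a, b, c)); // true
    }
}
```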

hashCode() method

Now let's talk about the method hashCode(). Why is it needed? For exactly the same purpose: comparing objects. But we already have equals()! Why another method? The answer is simple: to improve performance. A hash function, which in Java is represented by the method hashCode(), returns a fixed-length numeric value for any object. In Java, hashCode() returns a 32-bit number of type int. Comparing two numbers is much faster than comparing two objects using equals(), especially if equals() uses many fields. If our program compares objects, it is much cheaper to compare them by hash code first, and only if the hash codes are equal to proceed to the comparison by equals(). This is, by the way, how hash-based data structures work, for example the familiar HashMap! The method hashCode(), just like equals(), is overridden by the developer. And just like equals(), the method hashCode() has official requirements specified in the Oracle documentation:
  1. If two objects are equal (that is, the method equals() returns true), they must have the same hash code.

    Otherwise our methods will be meaningless. Checking by hashCode(), as we said, should come first to improve performance. If the hash codes are different, the check will return false, even though the objects are actually equal (as we defined in the method equals()).

  2. If the method hashCode() is called multiple times on the same object, it should return the same number each time, as long as the fields used in equals() have not changed.

  3. Rule 1 does not work in reverse. Two different objects can have the same hash code.
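Rule 3 is easy to observe with the standard String class, whose hashCode() formula is fixed by its documentation. The strings "Aa" and "BB" are a well-known pair with matching hash codes; the class name CollisionDemo is made up for this sketch.

```java
class CollisionDemo {

    public static void main(String[] args) {
        String first = "Aa";
        String second = "BB";

        System.out.println(first.hashCode());     // 2112
        System.out.println(second.hashCode());    // 2112
        System.out.println(first.equals(second)); // false: same hash code, different objects
    }
}
```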

The third rule is a little confusing. How can this be? The explanation is quite simple. The method hashCode() returns an int. An int is a 32-bit number, so it has a limited range of values: from -2,147,483,648 to +2,147,483,647. In other words, there are just under 4.3 billion possible int values (2^32 = 4,294,967,296). Now imagine that you are creating a program to store data about all living people on Earth. Each person will have their own object of the class Man. About 7.5 billion people live on Earth. In other words, no matter how good an algorithm we write for converting Man objects into numbers, we simply won't have enough numbers: we have only about 4.3 billion options, and far more people. This means that no matter how hard we try, some different people will end up with the same hash code. This situation (the hash codes of two different objects matching) is called a collision. One of the programmer's goals when overriding hashCode() is to reduce the potential number of collisions as much as possible. What will our method hashCode() for the class Man look like, taking into account all these rules? Like this:
@Override
public int hashCode() {
   return dnaCode;
}
Surprised? :) It may be unexpected, but if you look at the requirements, you will see that we comply with all of them. Objects for which our equals() returns true will also have equal hashCode() values. If two Man objects are equal according to equals() (that is, they have the same dnaCode), our method will return the same number for both. Let's look at a more complicated example. Let's say our program should select luxury cars for collector clients. Collecting is a complex business with many subtleties. A car from 1963 can cost 100 times more than the same car from 1964. A red car from 1970 can cost 100 times more than a blue car of the same make from the same year. In the first case, with the class Man, we discarded most of the fields (i.e., person characteristics) as insignificant and used only the dnaCode field for comparison. Here we are working in a very specialized domain, where no detail is minor! Here is our class LuxuryAuto:
public class LuxuryAuto {

   private String model;
   private int manufactureYear;
   private int dollarPrice;

   public LuxuryAuto(String model, int manufactureYear, int dollarPrice) {
       this.model = model;
       this.manufactureYear = manufactureYear;
       this.dollarPrice = dollarPrice;
   }

   //... getters, setters, etc.
}
Here, when comparing, we must take into account all fields. Any mistake can cost hundreds of thousands of dollars for the client, so it's better to be safe:
@Override
public boolean equals(Object o) {
   if (this == o) return true;
   if (o == null || getClass() != o.getClass()) return false;

   LuxuryAuto that = (LuxuryAuto) o;

   if (manufactureYear != that.manufactureYear) return false;
   if (dollarPrice != that.dollarPrice) return false;
   // null-safe comparison: model may be null
   return model == null ? that.model == null : model.equals(that.model);
}
In our method equals() we did not forget any of the checks that we talked about earlier. But now we compare each of the three fields of our objects. In this program, equality must be absolute, in every field. What about hashCode()?
@Override
public int hashCode() {
   int result = model == null ? 0 : model.hashCode();
   result = result + manufactureYear;
   result = result + dollarPrice;
   return result;
}
The field model in our class is a String. This is convenient: the class String already overrides the method hashCode(). We calculate the hash code of the field model and add the other two numeric fields to it. There is a little trick in Java that is used to reduce the number of collisions: when calculating the hash code, multiply the intermediate result by an odd prime number. The most commonly used numbers are 29 and 31. We won't go into the details of the math right now, but for future reference, remember that multiplying intermediate results by a large enough odd number helps “spread out” the results of the hash function, so that fewer objects end up with the same hash code. For our method hashCode() in LuxuryAuto it will look like this:
@Override
public int hashCode() {
   int result = model == null ? 0 : model.hashCode();
   result = 31 * result + manufactureYear;
   result = 31 * result + dollarPrice;
   return result;
}
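One way to see the effect of the multiplier is to compare the two versions on values that are merely swapped. In the sketch below, HashSpreadDemo, sumHash and multipliedHash are hypothetical names, and the year and price values are made up for illustration. With a plain sum, the order of the fields does not matter, so swapped values collide; with the multiplier, they do not.

```java
class HashSpreadDemo {

    // Plain sum: the order of the values does not affect the result
    static int sumHash(int modelHash, int year, int price) {
        int result = modelHash;
        result = result + year;
        result = result + price;
        return result;
    }

    // Multiply-by-31 variant, same shape as the hashCode() method above
    static int multipliedHash(int modelHash, int year, int price) {
        int result = modelHash;
        result = 31 * result + year;
        result = 31 * result + price;
        return result;
    }

    public static void main(String[] args) {
        int modelHash = "Ferrari 250 GTO".hashCode();

        // Two made-up cars whose year and price values are swapped
        System.out.println(sumHash(modelHash, 1963, 1964)
                == sumHash(modelHash, 1964, 1963));        // true: collision
        System.out.println(multipliedHash(modelHash, 1963, 1964)
                == multipliedHash(modelHash, 1964, 1963)); // false: no collision
    }
}
```

The standard library offers the same multiply-by-31 scheme ready-made: java.util.Objects.hash(model, manufactureYear, dollarPrice) combines its arguments this way, although the resulting number differs from the hand-written method because the calculation starts from 1 rather than from the first field's hash code.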
You can read more about all the intricacies of this mechanism in this post on StackOverflow, as well as in Joshua Bloch's book “Effective Java”. Finally, there is one more important point worth mentioning. Each time we overrode equals() and hashCode(), we selected certain fields of the object to be taken into account in these methods. But can we use different fields in equals() and hashCode()? Technically, we can. But this is a bad idea, and here's why:
@Override
public boolean equals(Object o) {
   if (this == o) return true;
   if (o == null || getClass() != o.getClass()) return false;

   LuxuryAuto that = (LuxuryAuto) o;

   if (manufactureYear != that.manufactureYear) return false;
   return dollarPrice == that.dollarPrice;
}

@Override
public int hashCode() {
   int result = model == null ? 0 : model.hashCode();
   result = 31 * result + manufactureYear;
   result = 31 * result + dollarPrice;
   return result;
}
Here are our methods equals() and hashCode() for the class LuxuryAuto. The method hashCode() remained unchanged, while in equals() we removed the field model. Now the model is not a characteristic for comparing two objects by equals(). But it is still taken into account when calculating the hash code. What will we get as a result? Let's create two cars and check it out!
public class Main {

   public static void main(String[] args) {

       LuxuryAuto ferrariGTO = new LuxuryAuto("Ferrari 250 GTO", 1963, 70000000);
       LuxuryAuto ferrariSpider = new LuxuryAuto("Ferrari 335 S Spider Scaglietti", 1963, 70000000);

       System.out.println("Are these two objects equal to each other?");
       System.out.println(ferrariGTO.equals(ferrariSpider));

       System.out.println("What are their hash codes?");
       System.out.println(ferrariGTO.hashCode());
       System.out.println(ferrariSpider.hashCode());
   }
}

Console output:

Are these two objects equal to each other?
true
What are their hash codes?
-1372326051
1668702472
Error! By using different fields for equals() and hashCode() we violated the contract established for them! Two objects that are equal according to equals() must have the same hash code. We got different values for them. Such errors can lead to the most unexpected consequences, especially when working with collections that use hashes. Therefore, when overriding equals() and hashCode(), use the same fields in both. The lecture turned out to be quite long, but today you learned a lot of new things! :) It's time to get back to solving problems!
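For anyone who wants to see those consequences first-hand, here is a small sketch with a HashSet. BrokenContractDemo is a hypothetical class name, and the nested LuxuryAuto is a reduced copy of the class from this lecture with the mismatched equals() and hashCode() methods shown above.

```java
import java.util.HashSet;
import java.util.Set;

class BrokenContractDemo {

    // Reduced copy of LuxuryAuto with the mismatched methods
    static final class LuxuryAuto {
        private final String model;
        private final int manufactureYear;
        private final int dollarPrice;

        LuxuryAuto(String model, int manufactureYear, int dollarPrice) {
            this.model = model;
            this.manufactureYear = manufactureYear;
            this.dollarPrice = dollarPrice;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            LuxuryAuto that = (LuxuryAuto) o;
            // model is ignored here...
            return manufactureYear == that.manufactureYear && dollarPrice == that.dollarPrice;
        }

        @Override
        public int hashCode() {
            // ...but is still used here
            int result = model == null ? 0 : model.hashCode();
            result = 31 * result + manufactureYear;
            result = 31 * result + dollarPrice;
            return result;
        }
    }

    public static void main(String[] args) {
        LuxuryAuto ferrariGTO = new LuxuryAuto("Ferrari 250 GTO", 1963, 70000000);
        LuxuryAuto ferrariSpider = new LuxuryAuto("Ferrari 335 S Spider Scaglietti", 1963, 70000000);

        Set<LuxuryAuto> garage = new HashSet<>();
        garage.add(ferrariGTO);

        System.out.println(ferrariGTO.equals(ferrariSpider)); // true
        System.out.println(garage.contains(ferrariSpider));   // false: the lookup goes by hash code first
        garage.add(ferrariSpider);
        System.out.println(garage.size());                    // 2: the set now holds two "equal" cars
    }
}
```

The set is supposed to reject duplicates, yet it ends up holding two cars that equals() considers the same, and it cannot find a car that is "equal" to one it already contains.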