HashSet in Java

The class HashSetimplements the interface Set, is based on a hash table, and is also backed by an instance HashMap. Since HashSetthe elements are not ordered, there is no guarantee that the elements will be in the same order after some time. The operations of adding, deleting and searching will be performed in constant time, provided that the hash function correctly distributes elements into “buckets”, which will be discussed later. HashSet in Java - 1

A few important points about HashSet:

Because the class implements the interface Set, it can only store unique values;
Can store NULL values;
The order in which elements are added is calculated using a hash code;
HashSetalso implements the Serializableand Cloneable.

To maintain a constant execution time of operations, the time spent on actions with HashSet, must be directly proportional to the number of elements in HashSet+ the “capacity” of the built-in instance HashMap(the number of “baskets”). Therefore, to maintain performance, it is important not to set the initial capacity too high (or the load factor too low). Initial capacity – the initial number of cells (“bins”) in the hash table. If all cells are filled, their number will increase automatically. Load factor is a measure of how full it can be HashSetbefore its capacity automatically increases. When the number of elements in HashSetbecomes greater than the product of the initial capacity and the load factor, the hash table is re-hashed (the hash codes of the elements are recalculated and the table is rebuilt according to the obtained values) and the number of cells in it is doubled. Load factor = Number of elements stored in the table / hash table size For example, if the initial number of cells in the table is 16, and the load factor is 0.75, then it follows that when the number of filled cells reaches 12, the number of cells will automatically increase. Load factor and initial capacity are the two main factors that affect the performance of HashSet. A load factor of 0.75 provides good performance on average. If this parameter is increased, then the memory load will be reduced (as it will reduce the number of re-hashes and rebuilds), but it will affect append and lookup operations. To minimize the time spent on re-hashing, you need to choose the right initial capacity parameter. If the initial capacity is greater than the maximum number of elements divided by the load factor, then no re-hashing operation will occur at all. Important: HashSetis not a data structure with built-in synchronization, so if multiple threads are working on it at the same time, and at least one of them is trying to make changes, it is necessary to provide synchronized access from the outside. This is often done at the expense of another synchronized object encapsulating the HashSet. If there is no such object, then the Collections.synchronizedSet(). This is currently the best way to prevent out-of-sync operations with HashSet.

Set s = Collections.synchronizedSet(new HashSet(...));

HashSet constructors:

HashSet h = new HashSet(); - default constructor. The default initial capacity is 16, load factor is 0.75.
HashSet h = new HashSet(int initialCapacity)– a constructor with a given initial capacity. Load factor – 0.75.
HashSet h = new HashSet(int initialCapacity, float loadFactor);— a constructor with a given initial capacity and load factor.
HashSet h = new HashSet(Collection C)– a constructor that adds elements from another collection.

The code below demonstrates some methods HashSet:

import java.util.*;

class Test
{
    public static void main(String[]args)
    {
        HashSet<String> h = new HashSet<String>();

        // Add elements to the HashSet using the add() method
        h.add("India");
        h.add("Australia");
        h.add("South Africa");
        h.add("India");// try to add another same element

        // Print the elements of the HashSet to the console
        System.out.println(h);
        System.out.println("List contains India or not:" +
                           h.contains("India"));

        // Remove elements from the set using the remove() method
        h.remove("Australia");
        System.out.println("List after removing Australia:"+h);

        // Loop through the elements of the HashSet using an iterator:
        System.out.println("Iterating over list:");
        Iterator<String> i = h.iterator();
        while (i.hasNext())
            System.out.println(i.next());
    }
}

Conclusion:

[South Africa, Australia, India]
List contains India or not:true
List after removing Australia:[South Africa, India]
Iterating over list:
South Africa
India

All classes that implement an interface Setare internally supported by implementations Map. HashSetstores elements using HashMap. Although HashMapan element must be represented as a key-value pair to add an element to, HashSetonly the value is added to. In fact, the value we pass to HashSetis the key to the object HashMap, and a constant is used as the value HashMap. This way, in every key-value pair, all the keys will have the same value. Implementation HashSetin java doc:

private transient HashMap map;

// Constructor - 1
// All constructors implicitly create a HashMap object.
public HashSet()
{
    // Create an implicit HashMap object
    map = new HashMap();
}

// Constructor- 2
public HashSet(int initialCapacity)
{
    // Create an implicit HashMap object
    map = new HashMap(initialCapacity);
}

// An object of the Object class, each time acting as a value in the HashMap
private static final Object PRESENT = new Object();

If you look at the method add()y HashSet:

public boolean add(E e)
{
   return map.put(e, PRESENT) == null;
}

You can notice that the method add()y HashSetcalls the method put()on the internal object HashMap, passing it the element to be added as the key, and the PRESENT constant as the value. The method works in a similar way remove(). It calls the remove()internal object method HashMap:

public boolean remove(Object o)
{
  return map.remove(o) == PRESENT;
}

HashSetis based on a hash table, and add, delete, or lookup operations will, on average, complete in constant (O(1)) time . Methods HashSet:

boolean add(E e): adds an element to HashSet, if there is none, but if such an element is already present, the method returns false .
void clear():removes all elements from a set.
boolean contains(Object o): Returns true if the given element is present in the set.
boolean remove(Object o): Removes the given element from the set, if present.
Iterator iterator(): Returns an iterator for the elements of the set.
boolean isEmpty(): Returns true if there are no elements in the set.
Object clone(): Performs surface cloning HashSet.

Javadoc: https://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html

Comments

TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION