The class
HashSet
implements the interface Set
, is based on a hash table, and is also backed by an instance HashMap
. Since HashSet
the elements are not ordered, there is no guarantee that the elements will be in the same order after some time. The operations of adding, deleting and searching will be performed in constant time, provided that the hash function correctly distributes elements into “buckets”, which will be discussed later. A few important points about HashSet
:
- Because the class implements the interface
Set
, it can only store unique values; - Can store NULL values;
- The order in which elements are added is calculated using a hash code;
HashSet
also implements theSerializable
andCloneable
.
HashSet
, must be directly proportional to the number of elements in HashSet
+ the “capacity” of the built-in instance HashMap
(the number of “baskets”). Therefore, to maintain performance, it is important not to set the initial capacity too high (or the load factor too low). Initial capacity – the initial number of cells (“bins”) in the hash table. If all cells are filled, their number will increase automatically. Load factor is a measure of how full it can be HashSet
before its capacity automatically increases. When the number of elements in HashSet
becomes greater than the product of the initial capacity and the load factor, the hash table is re-hashed (the hash codes of the elements are recalculated and the table is rebuilt according to the obtained values) and the number of cells in it is doubled. Load factor = Number of elements stored in the table / hash table size For example, if the initial number of cells in the table is 16, and the load factor is 0.75, then it follows that when the number of filled cells reaches 12, the number of cells will automatically increase. Load factor and initial capacity are the two main factors that affect the performance of HashSet
. A load factor of 0.75 provides good performance on average. If this parameter is increased, then the memory load will be reduced (as it will reduce the number of re-hashes and rebuilds), but it will affect append and lookup operations. To minimize the time spent on re-hashing, you need to choose the right initial capacity parameter. If the initial capacity is greater than the maximum number of elements divided by the load factor, then no re-hashing operation will occur at all. Important: HashSet
is not a data structure with built-in synchronization, so if multiple threads are working on it at the same time, and at least one of them is trying to make changes, it is necessary to provide synchronized access from the outside. This is often done at the expense of another synchronized object encapsulating the HashSet
. If there is no such object, then the Collections.synchronizedSet()
. This is currently the best way to prevent out-of-sync operations with HashSet
.
Set s = Collections.synchronizedSet(new HashSet(...));
HashSet constructors:
HashSet h = new HashSet();
- default constructor. The default initial capacity is 16, load factor is 0.75.HashSet h = new HashSet(int initialCapacity)
– a constructor with a given initial capacity. Load factor – 0.75.HashSet h = new HashSet(int initialCapacity, float loadFactor);
— a constructor with a given initial capacity and load factor.HashSet h = new HashSet(Collection C)
– a constructor that adds elements from another collection.
HashSet
:
import java.util.*;
class Test
{
public static void main(String[]args)
{
HashSet<String> h = new HashSet<String>();
// Add elements to the HashSet using the add() method
h.add("India");
h.add("Australia");
h.add("South Africa");
h.add("India");// try to add another same element
// Print the elements of the HashSet to the console
System.out.println(h);
System.out.println("List contains India or not:" +
h.contains("India"));
// Remove elements from the set using the remove() method
h.remove("Australia");
System.out.println("List after removing Australia:"+h);
// Loop through the elements of the HashSet using an iterator:
System.out.println("Iterating over list:");
Iterator<String> i = h.iterator();
while (i.hasNext())
System.out.println(i.next());
}
}
Conclusion:
[South Africa, Australia, India]
List contains India or not:true
List after removing Australia:[South Africa, India]
Iterating over list:
South Africa
India
All classes that implement an interface Set
are internally supported by implementations Map
. HashSet
stores elements using HashMap
. Although HashMap
an element must be represented as a key-value pair to add an element to, HashSet
only the value is added to. In fact, the value we pass to HashSet
is the key to the object HashMap
, and a constant is used as the value HashMap
. This way, in every key-value pair, all the keys will have the same value. Implementation HashSet
in java doc
:
private transient HashMap map;
// Constructor - 1
// All constructors implicitly create a HashMap object.
public HashSet()
{
// Create an implicit HashMap object
map = new HashMap();
}
// Constructor- 2
public HashSet(int initialCapacity)
{
// Create an implicit HashMap object
map = new HashMap(initialCapacity);
}
// An object of the Object class, each time acting as a value in the HashMap
private static final Object PRESENT = new Object();
If you look at the method add()
y HashSet
:
public boolean add(E e)
{
return map.put(e, PRESENT) == null;
}
You can notice that the method add()
y HashSet
calls the method put()
on the internal object HashMap
, passing it the element to be added as the key, and the PRESENT constant as the value. The method works in a similar way remove()
. It calls the remove()
internal object method HashMap
:
public boolean remove(Object o)
{
return map.remove(o) == PRESENT;
}
HashSet
is based on a hash table, and add, delete, or lookup operations will, on average, complete in constant (O(1)) time . Methods HashSet
:
boolean add(E e)
: adds an element toHashSet
, if there is none, but if such an element is already present, the method returns false .void clear():
removes all elements from a set.boolean contains(Object o)
: Returns true if the given element is present in the set.boolean remove(Object o)
: Removes the given element from the set, if present.Iterator iterator()
: Returns an iterator for the elements of the set.boolean isEmpty()
: Returns true if there are no elements in the set.Object clone()
: Performs surface cloningHashSet
.
GO TO FULL VERSION