The class
HashSet
implements the interface Set
, is based on a hash table, and is also backed by a HashMap
. The HashSet
elements are not ordered, there is no guarantee that the elements will be in the same order over time. The add, remove, and lookup operations will take constant time, provided that the hash function allocates the elements correctly to the "baskets", which will be discussed later. A few important points about HashSet
:
- Because the class implements the interface
Set
, it can only store unique values; - Can store NULL values;
- The order in which elements are added is calculated using a hash code;
HashSet
also implements the interfacesSerializable
andCloneable
.
HashSet
must be directly proportional to the number of elements in HashSet
+ the "capacity" of the built-in instance HashMap
(the number of "bins"). Therefore, to maintain performance, it is very important not to set the initial capacity too high (or the load factor too low). Initial capacity - the initial number of cells ("baskets") in the hash table. If all cells are filled, their number will increase automatically. The load factor is a measure of how full it can be HashSet
before its capacity automatically expands. When the number of elements inHashSet
becomes larger than the product of the initial capacity and the load factor, the hash table is re-hashed (the hash codes of the elements are re-calculated, and the table is rebuilt according to the obtained values) and the number of cells in it is increased by 2 times. Load factor = Number of elements stored in the table / size of the hash table For example, if the initial number of cells in the table is 16, and the load factor is 0.75, then it follows that when the number of filled cells reaches 12, their number will automatically increase. Load factor and initial capacity are the two main factors that affect the performance of operations withHashSet
. A load factor of 0.75 provides good performance on average. If this parameter is increased, then the memory load will decrease (since it will reduce the number of re-hashing and rebuilding operations), but it will affect the append and search operations. To minimize the time spent on re-hashing, you need to choose the right initial capacity parameter. If the initial capacity is greater than the maximum number of elements divided by the load factor, then no rehash operation will occur at all. Important: HashSet
is not a data structure with built-in synchronization, so if multiple threads are working on it at the same time, and at least one of them tries to make changes, you need to provide synchronized access from the outside. This is often done at the expense of another synchronized object that encapsulates the HashSet
. If there is no such object, then the Collections.synchronizedSet()
. This is by far the best way to prevent out of sync operations with HashSet
.
Set s = Collections.synchronizedSet(new HashSet(...));
HashSet constructors:
HashSet h = new HashSet();
is the default constructor. Default initial capacity is 16, load factor is 0.75.HashSet h = new HashSet(int initialCapacity)
is a constructor with a given initial capacity. The load factor is 0.75.HashSet h = new HashSet(int initialCapacity, float loadFactor);
is a constructor with a given initial capacity and load factor.HashSet h = new HashSet(Collection C)
is a constructor that adds elements from another collection.
HashSet
:
import java.util.*;
class Test
{
public static void main(String[]args)
{
HashSet<String> h = new HashSet<String>();
// Add elements to the HashSet using the add() method
h.add("India");
h.add("Australia");
h.add("South Africa");
h.add("India");// try to add another same element
// Print the elements of the HashSet to the console
System.out.println(h);
System.out.println("List contains India or not:" +
h.contains("India"));
// Remove elements from the set using the remove() method
h.remove("Australia");
System.out.println("List after removing Australia:"+h);
// Loop through the elements of the HashSet using an iterator:
System.out.println("Iterating over list:");
Iterator<String> i = h.iterator();
while (i.hasNext())
System.out.println(i.next());
}
}
Conclusion:
[South Africa, Australia, India]
List contains India or not:true
List after removing Australia:[South Africa, India]
Iterating over list:
South Africa
India
All classes that implement interface Set
are internally supported by implementations Map
. HashSet
stores elements with HashMap
. Although HashMap
it must be represented as a key/value pair to add an element to, HashSet
only the value is added to it. In fact, the value we pass to HashSet
is the key to the object HashMap
, and the value in HashMap
is a constant. Thus, in each key-value pair, all keys will have the same value. Implementation HashSet
in java doc
:
private transient HashMap map;
// Constructor - 1
// All constructors implicitly create a HashMap object.
public HashSet()
{
// Create an implicit HashMap object
map = new HashMap();
}
// Constructor- 2
public HashSet(int initialCapacity)
{
// Create an implicit HashMap object
map = new HashMap(initialCapacity);
}
// An object of the Object class, each time acting as a value in the HashMap
private static final Object PRESENT = new Object();
If you look at the add()
y method HashSet
:
public boolean add(E e)
{
return map.put(e, PRESENT) == null;
}
You can see that the add()
y method HashSet
calls a method put()
on the internal object HashMap
, passing it the element to be added as the key, and the PRESENT constant as the value. The method works the same way remove()
. It calls the method remove()
of the internal object HashMap
:
public boolean remove(Object o)
{
return map.remove(o) == PRESENT;
}
HashSet
is based on a hash table, and an add, delete, or lookup operation will, on average, take constant (O(1)) time . Methods HashSet
:
boolean add(E e)
: adds an element toHashSet
if it doesn't exist, if it already exists, the method returns false .void clear():
removes all elements from the set.boolean contains(Object o)
: Returns true if the given element is present in the set.boolean remove(Object o)
: Removes the given element from the set, if any.Iterator iterator()
: returns an iterator for the elements of the set.boolean isEmpty()
: returns true if the set has no elements.Object clone()
: performs a shallow cloneHashSet
.
GO TO FULL VERSION