JavaRush /Java Blog /Random EN /Briefly about the main thing - Java Collections Framework...

Level 3

8 August 2023
232 views
0 comments

Briefly about the main thing - Java Collections Framework

The beginning of the way

Today I would like to talk about such an interesting topic as the “ Java Collections Framework ” or, in simple terms, about collections. Most of the code's work is processing data in one form or another. Get a list of users, get a list of addresses, etc. Somehow sort them, perform a search, compare them. This is why knowledge of collections is considered a core skill. That's why I want to talk about it. In addition, one of the most common questions in Java developer interviews is collections. For example, "draw a hierarchy of collections." The online compiler will help us on our way. For example, you can use " Tutorialspoint Online Java Compiler " or Repl.it. The path to getting to know any data structures begins with ordinary variables (Variables). On the Oracle website, various topics are represented as "paths" or Trails. So, the path to getting to know Java is called “ Trail: Learning the Java Language: Table of Contents ”. And the basics of the language (i.e. Language Basics) start with Variables. Therefore, let's write a simple code:

public static void main(String[] args) {
	String user = "Max";
	System.out.println("Hello, " + user);
}

It is good in everything, except that we understand that this code is good and beautiful only for one variable. What to do if there are several of them? Arrays were invented to store data of one type. In the same Trail from Oracle there is a separate section dedicated to arrays. This section is called: " Arrays ". Working with arrays is also quite simple:

import java.util.Arrays;
class Main {
  public static void main(String[] args) {
    String[] users = new String[2];
    users[0] = "Max";
    users[1] = "John";
    System.out.println("Hello, " + Arrays.toString(users));
  }
}

Arrays solve the problem of storing multiple values in one place. But it imposes a limitation: the array size is constant. If, as in the example, we said that size = 2, then it is equal to two. That's all. If we want a larger array, we need to create a new instance. Also, finding an element is also a tricky thing for an array. There is a method Arrays.binarySearch, but this search only works on a sorted array (for an unsorted array, the result is undefined or simply unpredictable). That is, the search will oblige us to sort each time. Deleting also just clears the value. Therefore, we still don’t know how much data is actually in the array, we only know how many cells are in the array. To refresh your knowledge about arrays:

" JavaRush: Arrays in Java "
" The Java Tutorial: Arrays "

And as a consequence of the development of the Java language, the Java Collections Framework appeared in JDK 1.2, which we will talk about today.

Collection

Start costing from the very beginning. Why Collections? The term itself comes from things like "Type Theory" and "Abstract Data Types". But if you don’t look at any high matters, then when we have several things, we can call them a “collection of things.” Those who collect items. In general, the word collect itself comes from Lat. collectio "gathering, collecting." That is, a collection is a collection of something, a container for some elements. So we have a collection of elements. What we might want to do with it:

As you can see, we may want quite logical things. We also understand that we may want to do something with multiple collections:

Briefly about the main thing - Java Collections Framework - 4

Accordingly, the Java developers wrote the java.util.Collection interface to describe this general behavior for all collections . The Collection interface is where all collections originate. Collection is an idea, it is an idea of how all collections should behave. Therefore, the term "Collection" is expressed as an interface. Naturally, an interface needs implementations. The interface java.util.Collectionhas an abstract class AbstractCollection, that is, some “abstract collection”, which represents the skeleton for other implementations (as written in the JavaDoc above the class java.util.AbstractCollection). Speaking about collections, let's go back and remember that we want to iterate over them. This means that we want to iterate through the elements one by one. This is a very important concept. Therefore, the interface Collectionis inherited from Iterable. This is very important because... firstly, everything Iterable must be able to return an Iterator based on its contents. And secondly, everything that Iterable can be used in loops for-each-loop. And it is with the help of an iterator AbstractCollectionthat methods such as contains, toArray, are implemented remove. And the path to understanding collections begins with one of the most common data structures - a list, i.e. List.

Briefly about the main thing - Java Collections Framework - 5

Lists

So, lists occupy an important place in the hierarchy of collections:

Briefly about the main thing - Java Collections Framework - 6

As we can see, lists implement the java.util.List interface . Lists express that we have a collection of elements that are arranged in some sequence one after another. Each element has an index (like in an array). Typically, a list allows you to have elements with the same value. As we said above, Listit knows about the index of the element. This allows you to get ( get) an element by index or set a value for a specific index ( set). Collection methods add, addAll, removeallow you to specify the index from which to execute them. Additionally, y Listhas its own version of an iterator called ListIterator. This iterator knows about the index of the element, so it can iterate not only forward, but also backward. It can even be created from a specific place in the collection. Among all the implementations, there are two most commonly used: ArrayListand LinkedList. First, ArrayListit is a list ( List) based on an array ( Array). This allows you to achieve "Random Access" to elements. Random access is the ability to immediately retrieve an element by index, rather than iterate through all the elements until we find the element with the desired index. It is the array as a basis that allows this to be achieved. On the contrary, LinkedListit is a Linked List. Each entry in a linked list is represented in the form Entry, which stores the data itself, as well as a link to the next (next) and previous (previous) Entry. Thus LinkedListimplements "Sequential Access " . It is clear that to find the 5th element we will have to go from the first element to the last, because we don't have direct access to the fifth element. We can only access it from the 4th element. The difference in their concept is given below:

Briefly about the main thing - Java Collections Framework - 7

In work, as you understand, there is also a difference. For example, adding elements. The LinkedListelements are simply connected like links in a chain. But ArrayListit stores elements in an array. And an array, as we know, cannot change its size. How does it work then ArrayList? And it works very simply. When the space in the array runs out, it increases by 1.5 times. Here is the zoom code: int newCapacity = oldCapacity + (oldCapacity >> 1); Another difference in operation is any offset of elements. For example, when adding or removing elements to the middle. To remove from LinkedListan element, simply remove references to this element. In the case of , ArrayListwe are forced to shift the elements each time using the System.arraycopy. Thus, the more elements, the more actions will have to be performed. A more detailed description can be found in these articles:

Having examined ArrayList, one cannot help but remember its “predecessor”, the java.util.Vector class . It differs Vectorin ArrayListthat the methods for working with the collection (adding, deleting, etc.) are synchronized. That is, if one thread ( Thread) adds elements, then other threads will wait until the first thread finishes its work. Since thread safety is often not required, it is recommended to use the class in such cases ArrayList, as explicitly stated in the JavaDoc for the class Vector. In addition, Vectorit increases its size not by 1.5 times, ArrayListbut by 2 times. Otherwise, the behavior is the same - Vectorthe storage of elements in the form of an array is hidden and adding/removing elements has the same consequences as in ArrayList. In fact, Vectorwe remembered this for a reason. If we look in the Javadoc, we will see in "Direct Known Subclasses" a structure such as java.util.Stack . The stack is an interesting structure that is a LIFO last-in-first-out(last in, first out) structure. Stack translated from English is a stack (like a stack of books, for example). The stack implements additional methods: peek(look, look), pop(push), push(push). The method peekis translated as look (for example, peek inside the bag is translated as “ look inside the bag ”, and peek through the keyhole is translated as “ peek through the keyhole ”). This method allows you to look at the “top” of the stack, i.e. get the last element without removing (i.e. without removing) it from the stack. The method pushpushes (adds) a new element onto the stack and returns it, and the element method poppushes (removes) and returns the removed one. In all three cases (i.e. peek, pop and push), we only work with the last element (i.e. the “top of the stack”). This is the main feature of the stack structure. By the way, there is an interesting task to understand stacks, described in the book “A Programmer’s Career” (Cracking Coding Interview). There is an interesting task where using the “stack” structure (LIFO) you need to implement the “queue” structure (FIFO). It should look like this:

Briefly about the main thing - Java Collections Framework - 8

Analysis of this task can be found here: " Implement A Queue Using Stacks - The Queue ADT ("Implement Queue Using Stacks" on LeetCode) ". So we smoothly move on to a new data structure - a queue.

Briefly about the main thing - Java Collections Framework - 9

Queue

A Queue is a structure familiar to us from life. Queues to shops, to doctors. Whoever came first (First In) will be the first to leave the queue (First Out). In Java, a queue is represented by the java.util.Queue interface . According to the queue's Javadoc, the queue adds the following methods:

Briefly about the main thing - Java Collections Framework - 10

As you can see, there are order methods (failure to execute them is fraught with an exception) and there are request methods (the inability to execute them does not lead to errors). It is also possible to get the element without removing it (peek or element). The queue interface also has a useful successor - Deque . This is the so-called "two-way queue". That is, such a queue allows you to use this structure both from the beginning and from the end. The documentation says that "Deques can also be used as LIFO (Last-In-First-Out) stacks. This interface should be used in preference to the legacy Stack class.", that is, it is recommended to use Deque implementations instead of Stack. The Javadoc shows what methods the Deque interface describes:

Briefly about the main thing - Java Collections Framework - 11

Let's see what implementations there are. And we’ll see an interesting fact - LinkedList has gotten into the queue camp) That is, LinkedList implements both List, and Deque. But there are also “queues only”, for example PriorityQueue. She is not often remembered, but in vain. Firstly, you cannot use "non-comparable objects" in this queue, i.e. either Comparator must be specified or all objects must be comparable. Secondly, "this implementation provides O(log(n)) time for the enqueuing and dequeuing methods". Logarithmic complexity is there for a reason. Implemented PriorityQueue based on the heap. The Javadoc says: "Priority queue represented as a balanced binary heap." The storage itself for this is a regular array. Which grows when necessary. When the heap is small, it increases 2 times. And then by 50%. Comment from the code: "Double size if small; else grow by 50%". Priority queue and Binary Heap are a separate topic. So for more information:

Super article on Habré: " Data structures. Informal guide "
A clear explanation about the "heap": " Foxford: Computer Science. Data structures: Heap "
Article from Habr about Binary Heap: " Data structures: binary heap "
" Introduction to Binary Heaps (MaxHeaps) "

As an implementation, java.util.Dequeyou can use the java.util.ArrayDeque class . That is, lists can be implemented using a linked list and an array, and queues can also be implemented using an array or using a linked list. QueueThe and interfaces Dequehave descendants representing the "blocking queue": BlockingQueueand BlockingDeque. Here is the interface change compared to regular queues:

Briefly about the main thing - Java Collections Framework - 12

Let's look at some examples of blocking queues. But they are interesting. For example, BlockingQueue is implemented by: PriorityBlockingQueue , SynchronousQueue , ArrayBlockingQueue, DelayQueue , LinkedBlockingQueue . But BlockingDequethey implement everything from the standard Collection Frameworks LinkedBlockingDeque. Each queue is the topic of a separate review. And within the framework of this review, we will depict the class hierarchy not only with List, but also with Queue:

Briefly about the main thing - Java Collections Framework - 13

As we can see from the diagram, the interfaces and classes of the Java Collections Framework are highly intertwined. Let's add another branch of the hierarchy - Set.

Briefly about the main thing - Java Collections Framework - 14

Set

Set— translated as “set.” It differs from a queue and a list Setin its greater abstraction over the storage of elements. Set- like a bag of objects, where it is unknown how the objects are located and in what order they lay. In Java, such a set is represented by the java.util.Set interface . As the documentation says, Setthis is a "collection that contains no duplicate elements". Interestingly, the interface itself Setdoes not add new methods to the interface Collection, but only clarifies the requirements (about what should not contain duplicates). In addition, from the previous description it follows that you cannot simply Setget an element from it. Iterator is used to get elements. Sethas several more interfaces associated with it. The first one is SortedSet. As the name suggests, SortedSetit indicates that such a set is sorted, and therefore the elements implement the interface Comparableor are specified Comparator. In addition, SortedSetit offers several interesting methods:

Briefly about the main thing - Java Collections Framework - 15

In addition, there are methods first(smallest element by value) and last(largest element by value). There SortedSetis an heir - NavigableSet. The purpose of this interface is to describe the navigation methods needed to more accurately identify appropriate elements. An interesting thing is NavigableSetthat it adds to the usual iterator (which goes from smallest to largest) an iterator for the reverse order - descendingIterator. In addition, NavigableSetit allows you to use the method descendingSetto obtain a view of yourself (View), in which the elements are in reverse order. This is called Viewbecause through the resulting element you can change the elements of the original one Set. That is, in essence, it is a representation of the original data in a different way, and not a copy of it. Interestingly, NavigableSet, like Queue, can handle pollFirst(minimal) and pollLast(maximal) elements. That is, it gets this element and removes it from the set. What kind of implementations are there? Firstly, the most famous implementation is based on a hash code - HashSet . Another equally well-known implementation is based on a red-black tree - TreeSet . Let's complete our diagram:

Briefly about the main thing - Java Collections Framework - 16

Within the collections, it remains to sort out the hierarchy - the hermits. Which at first glance stands aside - java.util.Map.

Briefly about the main thing - Java Collections Framework - 17

Maps

Maps are a data structure in which data is stored by key. For example, the key could be an ID or city code. And it is by this key that the data will be searched. It is interesting that the cards are displayed separately. According to the developers (see " Java Collections API Design FAQ "), key-value mapping is not a collection. And maps can more quickly be thought of as a collection of keys, a collection of values, a collection of key-value pairs. This is such an interesting animal. What methods do cards provide? Let's look at the Java API interface java.util.Map . Because Since maps are not collections (they do not inherit from Collections), they do not contain a contains. And this is logical. A map consists of keys and values. Which of these should the method check containsand how not to get confused? Therefore, the interface Maphas two different versions: containsKey(contains a key) and containsValue(contains a value). Using it keySetallows you to get a set of keys (the same one Set). And using the method valueswe can get a collection of values in the map. The keys in the map are unique, which is emphasized by the data structure Set. Values can be repeated, which is emphasized by the Collection data structure. In addition, using the method entrySetwe can obtain a set of key-value pairs. You can read more about what card implementations there are in the most detailed analyzes:

I would also like to see what is HashMapvery similar to HashSet, and TreeMapto TreeSet. They even have similar interfaces: NavigableSetand NavigableMap, SortedSetand SortedMap. So our final map will look like this:

Briefly about the main thing - Java Collections Framework - 18

We can end with the interesting fact that the collection Setinternally uses Map, where the added values are keys, and the value is the same everywhere. This is interesting because it Mapis not a collection and returns Set, which is a collection but is in fact implemented as Map. A little surreal, but that’s how it turned out)

Briefly about the main thing - Java Collections Framework - 19

Conclusion

The good news is that this ends this review. The bad news is that this is a very review article. Each implementation of each of the collections deserves a separate article, and also for each algorithm hidden from our eyes. But the purpose of this review is to remember what they are and what the connections are between the interfaces. I hope that after thoughtful reading you will be able to draw a diagram of the collections from memory.

Well, as usual, some links:

#Viacheslav

Comments

TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION