Interview Questions: Java Data Structures. Part 2

PART 1 Now we are talking about the basics that every Java developer should know. About those classical knowledge with which it all begins. Today I would like to touch on one of the fundamental topics of any interview - data structures in Java . So, instead of beating around the bush, we'll get started. Catch the continuation of the list of questions that you can be asked on this topic at the interview.

6. Talk about List

List is an interface that represents an ordered structure of objects, which is called a list. Interview Questions: Data Structures in Java - 5

The “trick” of this structure is that the elements contained in the List can be inserted, modified or deleted by index, that is, the internal identifier of the List . In other words, the index means: "how many elements from the beginning of the list". The first element of the List has index 0, the second element has index 1, and so on. Thus, the fifth element is four elements away from the beginning of the list. As mentioned above, the order in which items are added to a list is important. Therefore, the data structure is called a list . We list the methods unique to this structure that are aimed at working with elements by index:

get - returns the element at the specified position (by index value),
remove - removes the element at the specified position,
set - replaces the element at the specified position with the element specified in the method.

The main implementations are ArrayList and LinkedList . We'll talk more about them a little later. Vector is a list that is thread-safe, so every method in this class is synchronized. But keep in mind that if you want to secure some list operations, you will synchronize the whole sequence of operations. And synchronization of individual operations is both less secure and much slower. Of course, Vector also has the overhead of a lock, even if you don't need the lock. Therefore, this class is now considered obsolete and is not used. By the way: ArrayList is analogous to Vector, but does not use a lock, so it is used everywhere. Stack is a subclass of the Vector class with one default constructor and all the methods of the Vector class , as well as several of its own (we'll talk about them a little later). As an example, think of a process as a stack of document folders. You put one folder at a time on top of the pile, and you can take these folders only in reverse order, starting from the top. Actually, this is the mechanism of the LIFO type , that is, Last In First Out , the last one came - the first one left. The stack implements its own methods:

push - adds the passed element to the top of the stack;
peek - returns the element that is at the top of the stack;
pop - also returns the element that is at the top of the stack, but removes it;
empty - checks if the stack is empty - true , or not - false ;
search - searches the stack for the given element. If an element is found, its ordinal relative to the top of the stack is returned. If the element is not found, -1 is returned.

At the moment, the Stack subclass is not actually used due to its simplicity and inflexibility, but, nevertheless, you may encounter it. For example, when you get some error, and in the console you see a stack of messages about it. You can read more about the stack and queue in this article .

7. Tell me about Map

As mentioned above, Map is a collection that has a separate structure of interfaces and their implementations. It is separate because here the values are not stored one at a time, but in a key-value pair. Map Interview Questions: Data Structures in Java - 6

Interview Questions: Data Structures in Java - 6

main methods :

put(K key, V value) - adding an element to the Map;
get(Object key) - search for a value by key;
containsKey(Object key) - check Map for the presence of the given key;
containsValue(Object value) - check Map for the presence of the given value;
remove(Object key) - remove a value by its key.

As you can see, most operations work by using a key. Immutable objects are usually chosen as keys . A typical example of this object is String . The main implementations of Map :

HashMap - designed to store values in an arbitrary order, but allows you to quickly look up map elements. Allows you to set the key with the keyword null , but not more than once, because pairs with the same keys are written on top of each other. The main condition is the uniqueness of the keys: values can be repeated (there can be several null values).
LinkedHashMap is similar to HashMap , which stores values in the order they were added. Accordingly, like LinkedList , it has a header - the head of a doubly linked list. When initialized, it points to itself.
LinkedHashMap also has accessOrder , which specifies how the elements will be accessed while using the iterator. If accessOrder is false, access will be done in the order in which the elements are inserted. If set to true, the elements will be in last accessed order (the last accessed element will be placed at the end).
TreeMap is a Map that sorts elements by key values. Similar to TreeSet , but for key-value pairs. To specify sorting rules for a TreeMap, keys must implement the Comparable interface. Otherwise, there must be a Comparator oriented to keys (the one that is set in the TreeMap constructor ), TreeSet - implemented with a TreeMap object inside, in which, in fact, all the magic happens.
You can read more about sorting in TreeMap using red-black trees in the article about TreeMap features .
Hashtable - Similar to HashMap , but does not allow nulls to be stored either as keys or as values. It is carefully synchronized in terms of multithreading, which in turn means that it is safe in terms of multithreading. But this implementation is outdated and slow, so now you will not meet Hashtable in more or less new projects.

8. ArrayList vs LinkedList. Which one is preferable to use?

This question is perhaps the most popular one on data structures and carries some pitfalls. Before answering it, let's learn more about these data structures. ArrayList implements the List interface , and operates on an internal array that expands as needed. When the internal array is completely filled, and a new element needs to be inserted, a new array is created, with a size of (oldSize * 1.5) +1. After that, all data from the old array is copied to the new + new element, the old one will be removed by the garbage collector . add methodadds an element to the last empty cell in the array. That is, if we already have 3 elements there, it will add the next one to the 4th cell. Let's walk through the performance of the basic methods:

get(int index) - taking an element in an array by index is the fastest in O(1) ;
add(Object obj) - if there is enough space in the internal array for a new element, then O(1) time will be spent during a normal insertion , since the addition goes purposefully to the last cell.
If you need to create a new array and copy the contents into it, then the time we will have is directly proportional to the number of elements in the array O(n) ;
remove (int index) - when removing an element, for example, from the middle, we will get O (n / 2) time, since we will need to move the elements to the right of it one cell back. Accordingly, if deleting from the beginning of the list, then O (n), from the end - O (1);
add(int index, Object obj) - a situation similar to deletion: when adding to the middle, we will need to move the elements from the right one cell forward, so the time is O (n / 2). Of course, from the beginning - O(n), from the end - O(1);
set(int index, Object obj) - the situation is different here, since you only need to find the desired element and write over it without moving the rest, so O (1).

Learn more about ArrayList in this article . LinkedList implements two interfaces at once - List and Queue , and therefore owns the properties and methods inherent in both data structures. From List, he took access to the element by index, from Queue - the presence of a “head” and “tail”. Internally, it is implemented as a data structure representing a doubly linked list. That is, each element has a link to the next and previous, except for the “tail” and “head”.

get(int index) - when searching for an element that is in the middle of the list, it starts searching through all the elements in order until the desired one is found. Logically, the search should take O(n/2) , but LinkedList also has a tail, so the search is carried out simultaneously from both sides. Accordingly, the time is reduced to O(n/4) .
If the element is not far from the beginning of the list or the end, then the time will be O(1) ;
add(Object obj) - when a new element is added, a link to the next element will be added to the “tail” element, and the new one will receive a link to this previous element and become a new “tail”. Accordingly, the time will be O(1) ;
remove(int index) - similar logic to the get(int index) method . To remove an element from the middle of the list, it must first be found. This is again O(n/4) , while the deletion itself actually takes nothing, since the pointer of neighboring objects only changes there (they begin to refer to each other). If the element is at the beginning or at the end, then again - O(1) ;
add(int index, Object obj) and set(int index, Object obj) - the time complexity of the methods will be identical to get(int index) , since most of the time is spent searching for the element. Therefore, for the middle of the list - O(n/4) , for the beginning - O(1).

Learn more about working with LinkedList in this article . Let's look at all this in a table:

Operation	ArrayList	LinkedList
Get by index get(index)	O(1)	In the middle O(n/4)
Add new element add(obj)	O(1) If you need to copy the array then - O (n)	O(1)
Remove element remove(int index)	From the beginning - O(n) From the middle - O(n/2) From the end - O(1)	Middle - O(n/4) At the end or at the beginning - O(n)
Add element add(int index, Object obj)	To the beginning - O(n) To the middle - O(n/2) End - O(1)	Middle - O(n/4) At the end or at the beginning - O(n)
Replace element set(index, obj)	O(1)	Middle - O(n/4) At the end or at the beginning - O(n)

As you probably already understood, it is impossible to answer this question unambiguously. After all, in different situations they work at different speeds. Therefore, when you are asked a similar question, you should immediately ask what this list will focus on and what operations will be most often performed. Already starting from this, give an answer, but with explanations why this is so. Let's summarize our comparison: ArrayList:

the best choice if the most frequent operation is element search, element overwriting;
the worst choice if the operation is insertion and deletion in the beginning-middle, because elements shift operations will take place from the right.

LinkedList:

the best choice if our frequent operation is insertion and deletion in the beginning-middle;
the worst choice if the most frequent operation is a search.

9. How are elements stored in a HashMap?

The HashMap collection contains an internal array Node [] table , whose cells are also called buckets (baskets). Node contain:

key - link to the key,
value — reference to value,
hash - hash value,
next - a link to the next Node .

One cell of the table[] array can contain a reference to a Node object with a link to the next Node element , and it can have a link to another, and so on ... As a result, these Node elements can form a singly linked list , with elements with a link to the next. At the same time, the hash value of the elements of the same chain is the same. After a short digression, let's see how the elements are stored in the HashMap :

The key is checked against null . If it's null , then key will be stored in table[0] , because the hash code for null is always 0.
If the key is not null , then the hashcode() method is called on the key object , which will return its hash code. This hash code is used to determine the cell in the array where the Node object will be stored.
Next, this hashcode is placed in the internal hash() method , which calculates the hashcode, but already within the size of the table[] array .
Further, depending on the hash value, the Node is placed in a specific cell in the table[] array .
If the table[] cell used to store the current Node element is not empty, but already has some element, then the Node elements are iterated by the next value until the last element is reached. That is, the one whose next field is null .
During this enumeration, the key of the guarded Node object is compared with the keys of the iterated:
- if a match is found, the iteration will end and the new Node will overwrite the Node in which the match was found (only its value field will be overwritten );
- if no key matches are found, then the new Node will be the last one in this list, and the previous one will have a next link to it.

Often in interviews, the question flashes: what is a collision ? The situation when a cell of the table[] array stores not one element, but a chain of two or more, is called a collision . In normal cases, when only one element is stored in one table[] cell , accessing the elements of a HashMap has constant time complexity O(1) . But when there is a chain of elements in the cell with the desired element ( collision ), then O(n) , since in this case the time is directly proportional to the number of elements to be sorted through.

10. Tell us about the iterator

In the Collection hierarchy mapping scheme above, the Collection interface was where the whole hierarchy started, but in practice it's not exactly like that. Collection inherits from an interface with an iterator() method that returns an object that implements the Iterator<E> interface . The Iterator interface looks like this:

public interface Iterator <E>{

    E next();
    boolean hasNext();
    void remove();
}

next() - by calling this method, you can get the next element. hasNext() - makes it possible to find out if there is a next element and if the end of the collection has been reached. And when there are still elements, hasNext() will return true . Typically, hasNext() is called before the next() method , because when the end of the collection is reached, next() will throw a NoSuchElementException . remove() - removes the element that was obtained by the last call to next() . The purpose of an Iterator is to iterate over elements. For example:

Set<Integer> values = new TreeSet<>();
  values.add(5);
values.add(3);
values.add(6);
values.add(8);
values.add(2);
values.add(4);
values.add(1);
values.add(7);

Iterator<Integer> iter = values.iterator();
while(iter.hasNext()){
  System.out.println(iter.next());
}

Actually, the for-each loop is implemented under the hood using an iterator. You can read more about this here . List provides its own version of the iterator, but more cool and fancy - ListIterator . This interface extends Iterator and has additional methods:

hasPrevious will return true if the collection has a previous element, false otherwise;
previous returns the current element and jumps to the previous one; if there is none, then a NoSuchElementException is thrown;
add will insert the passed object before the element to be returned by the next call to next() ;
set assigns a reference to the passed object to the current element;
nextIndex returns the index of the next element. If there is none, then the size of the list is returned;
previousIndex returns the index of the previous element. If there is none, then -1 is returned.

Well, that's all for me today. I hope that after reading this article you have become even closer to your cherished dream - to become a developer.

Interview Questions: Java Data Structures. Part 2 - 3

Comments

TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION