JavaRush /Java Blog /Random EN /What they might ask at an interview: Data structures in J...

Level 36

28 February 2021
47 views
0 comments

What they might ask at an interview: Data structures in Java. Part 2

PART 1 Now we are talking about the base that every Java developer should know. About that classical knowledge from which it all begins. Today I would like to touch on one of the fundamental topics of any interview - data structures in Java . So, instead of beating around the bush, let's get started. Catch the continuation of the list of questions that you may be asked on this topic during an interview.

6. Tell us about List

List is an interface that represents an ordered structure of objects, which is called a list. What They May Ask During an Interview: Data Structures in Java - 5

What They May Ask During an Interview: Data Structures in Java - 5

The “trick” of this structure is that the elements contained in the List can be inserted, modified or deleted by index, that is, the internal identifier of the List . In other words, index means: “how many elements are from the beginning of the list.” The first List element has index 0, the second has index 1, and so on. So the fifth element is four elements away from the beginning of the list. As mentioned above, the order in which items are added to a list is important. That's why the data structure is called a list . We list methods unique to this structure that are aimed at working with elements by index:

get - returns the element at the specified position (by index value),
remove - removes the element at the specified position,
set - replaces the element at the specified position with the element specified in the method.

The main implementations are ArrayList and LinkedList . We'll talk more about them a little later. Vector is a list that is thread-safe, so each method in this class is synchronized. But keep in mind that if you want to secure some list actions, you will be synchronizing a whole sequence of operations. And synchronizing individual operations is both less secure and much slower. Of course, Vector also has locking overhead, even if you don't need the lock. Therefore, this class is now considered obsolete and is not used. By the way: ArrayList is similar to Vector , but does not use locking, so it is used everywhere. Stack is a subclass of the Vector class with one default constructor and all the methods of the Vector class , plus a few of its own (we'll talk about them later). As an example, you can imagine the process as a stack of folders with documents. You place one folder at the top of the stack, and you can only take these folders in reverse order, starting from the top. Actually, this is a LIFO type mechanism , that is, Last In First Out , the last one to come is the first to leave. The stack implements its own methods:

push - adds the passed element to the top of the stack;
peek - returns the element that is at the top of the stack;
pop - also returns the element that is at the top of the stack, but removes it;
empty - checks whether the stack is empty - true or not - false ;
search - searches the stack for a given element. If an element is found, its sequence number relative to the top of the stack is returned. If the element is not found, the value -1 is returned.

At the moment, the Stack subclass is not actually used due to its simplicity and inflexibility, but, nevertheless, you may come across it. For example, when you receive some error and in the console you see a stack of messages about it. You can read more about the stack and queue in this article .

7. Tell us about Map

As stated above, a Map is a collection that has a separate structure of interfaces and their implementations. It is separate because here the values are not stored one at a time, but in a “key-value” pair. What They May Ask During an Interview: Data Structures in Java - 6

What They May Ask During an Interview: Data Structures in Java - 6

Basic Map methods :

put(K key, V value) - adding an element to the Map;
get(Object key) - search for a value by key;
containsKey(Object key) — checks the Map for the presence of this key;
containsValue(Object value) - checks the Map for the presence of this value;
remove(Object key) - removing a value by its key.

As you can see, most operations work by using a key. As a rule, immutable objects are chosen as keys . A typical example of this object is String . Basic Map implementations :

HashMap - designed to store values in random order, but allows you to quickly search for map elements. Allows you to specify a key using the null keyword , but not more than once, because pairs with the same keys are written on top of each other. The main condition is the uniqueness of the keys: the values can be repeated (there may be several null values).
LinkedHashMap is an analogue of HashMap that stores values in the order they were added. Accordingly, like LinkedList , it has a header - the head of a doubly linked list. When initialized, it points to itself.
LinkedHashMap also has an accessOrder which specifies how the elements will be accessed when the iterator is used. If accessOrder is false, access will be performed in the order the elements were inserted. If true, the elements will be in last accessed order (the last accessed element will be placed at the end).
TreeMap is a Map that sorts elements by key values. Similar to TreeSet , but for pairs based on key values. To set TreeMap sorting rules, keys must implement the Comparable interface. Otherwise, there should be a Key-oriented Comparator (the one that is specified in the TreeMap constructor ), a TreeSet - implemented with a TreeMap object inside, in which, in fact, all the magic happens.
You can read more about sorting in TreeMap using red-black trees in the article about the features of TreeMap .
Hashtable is similar to HashMap , but does not allow nulls to be stored either as keys or values. It is carefully synchronized from a multi-threading point of view, which in turn means that it is safe from a multi-threading point of view. But this implementation is outdated and slow, so now you won’t see Hashtable in more or less new projects.

8. ArrayList vs LinkedList. Which one is preferable to use?

This question is perhaps the most popular on data structures and carries with it some pitfalls. Before answering it, let's learn more about these data structures. ArrayList implements the List interface and operates on an internal array that is expanded as needed. When the internal array is completely filled, and a new element needs to be inserted, a new array is created with a size of (oldSize * 1.5) +1. After this, all data from the old array is copied to the new + new element, and the old one will be deleted by the garbage collector . The add method adds an element to the last empty cell of the array. That is, if we already have 3 elements there, it will add the next one to the 4th cell. Let's go through the performance of the basic methods:

get(int index) - taking an element in an array by index is the fastest in O(1) ;
add(Object obj) - if there is enough space in the internal array for a new element, then with a normal insertion O(1) time will be spent , since the addition is targeted to the last cell.
If we need to create a new array and copy the contents into it, then our time will be directly proportional to the number of elements in the array O(n) ;
remove(int index) - when removing an element, for example, from the middle, we will get O(n/2) time, since we will need to move the elements to the right of it one cell back. Accordingly, if deleting from the beginning of the list, then O(n), from the end - O(1);
add(int index, Object obj) - a situation similar to deleting: when adding to the middle, we will need to move the elements on the right one cell forward, so the time is O(n/2). Of course, from the beginning - O(n), from the end - O(1);
set(int index, Object obj) - here the situation is different, since you only need to find the desired element and write over it without moving the rest, so O(1).

Read more about ArrayList in this article . LinkedList implements two interfaces at once - List and Queue , and therefore has properties and methods inherent in both data structures. From List he took access to an element by index, from Queue - the presence of a “head” and “tail”. Internally, it is implemented as a data structure representing a doubly linked list. That is, each element has a link to the next and previous one, except for the “tail” and “head”.

get(int index) - when searching for an element that is in the middle of the list, it starts searching through all the elements in order until the desired one is found. Logically, the search should take O(n/2) , but LinkedList also has a tail, so the search is carried out simultaneously from both sides. Accordingly, the time is reduced to O(n/4) .
If the element is close to the beginning of the list or the end, then the time will be O(1) ;
add(Object obj) - when adding a new element, the “tail” element will have a link to the next element added, and the new one will receive a link to this previous element and become the new “tail”. Accordingly, the time will be O(1) ;
remove(int index) - logic similar to the get(int index) method . To remove an element from the middle of the list, you must first find it. This is again O(n/4) , while the deletion itself actually takes nothing, since it only changes the pointer of neighboring objects (they begin to refer to each other). If the element is at the beginning or at the end, then again - O(1) ;
add(int index, Object obj) and set(int index, Object obj) - the methods will have a time complexity identical to get(int index) , since most of the time is spent searching for the element. Therefore, for the middle of the list - O(n/4) , for the beginning - O(1).

More information about working with LinkedList is described in this article . Let's look at all this in a table:

Operation	ArrayList	LinkedList
Get by index get(index)	O(1)	In the middle O(n/4)
Add a new element add(obj)	O(1) If you need to copy an array, then - O(n)	O(1)
Remove element remove(int index)	From the beginning - O(n) From the middle - O(n/2) From the end - O(1)	In the middle - O(n/4) At the end or at the beginning - O(n)
Add element add(int index, Object obj)	Back to top - O(n) To the middle - O(n/2) To the end - O(1)	In the middle - O(n/4) At the end or at the beginning - O(n)
Replace element set(index, obj)	O(1)	In the middle - O(n/4) At the end or at the beginning - O(n)

As you probably already understood, it is impossible to answer this question unambiguously. After all, in different situations they work at different speeds. Therefore, when you are asked a question like this, you should immediately ask what this list will be focused on and what operations will be most often performed. Starting from this, give an answer, but with explanations of why this is so. Let's summarize our comparison: ArrayList:

the best choice if the most frequent operation is searching for an element, overwriting an element;
the worst choice if the operation is insertion and deletion at the beginning-middle, because the shift operations of elements on the right will take place.

LinkedList:

the best choice if our frequent operation is insertion and deletion at the beginning-middle;
worst choice if the most common operation is searching.

9. How are elements stored in a HashMap?

The HashMap collection contains an internal array Node[] table , the cells of which are also called buckets . Node contain:

key - link to the key,
value — reference to the value,
hash - hash value,
next - link to the next Node .

One cell of the table[] array may contain a reference to a Node object with a link to the next Node element , and it may have a link to another, and so on... As a result, these Node elements can form a singly linked list , with elements with a link to the next. In this case, the hash value of the elements of one chain is the same. After a short digression, let's look at how elements are stored in a HashMap :

The key is checked for null . If it is null , then key will be stored in the table[0] cell because the hash code for null is always 0.
If the key is not null , then the key object's hashcode() method is called , which will return its hash code. This hash code is used to determine the array cell where the Node object will be stored.
Next, this hashcode is placed in the internal hash() method , which calculates the hashcode, but within the size of the table[] array .
Next, depending on the hash value, Node is placed in a specific cell in the table[] array .
If the table[] cell used to save the current Node element is not empty, but already has some element, then the Node elements are iterated over the next value until the last element is reached. That is, the one whose next field is null .
During this search, the key of the protected Node object is compared with the keys of the ones being searched:
- if a match is found, the search will end, and the new Node will overwrite the Node in which the match was found (only its value field will be overwritten );
- if no key matches are found, then the new Node will become the last in this list, and the previous one will have a next link to it.

The question often comes up during interviews: what is a conflict ? The situation when a cell of the table[] array stores not one element, but a chain of two or more, is called a collision . In normal cases where only one element is stored in a single table[] cell , accessing the elements of a HashMap has constant O(1) time complexity . But when a cell with the desired element contains a chain of elements ( collision ), then O(n) , since in this case the time is directly proportional to the number of elements being sorted.

10. Explain about iterator

In the Collection hierarchy mapping diagram above, the Collection interface was where the entire hierarchy began, but in practice it's not quite like that. Collection inherits from an interface with an iterator() method , which returns an object that implements the Iterator<E> interface . The Iterator interface looks like this:

public interface Iterator <E>{

    E next();
    boolean hasNext();
    void remove();
}

next() - by calling this method, you can get the next element. hasNext() - allows you to find out whether there is a next element, and whether the end of the collection has been reached. And when there are still elements, hasNext() will return true . Typically, hasNext() is called before the next() method , since next() will throw a NoSuchElementException when it reaches the end of the collection . remove() - Removes the element that was retrieved by the last call to next() . The purpose of Iterator is to iterate over elements. For example:

Set<Integer> values = new TreeSet<>();
  values.add(5);
values.add(3);
values.add(6);
values.add(8);
values.add(2);
values.add(4);
values.add(1);
values.add(7);

Iterator<Integer> iter = values.iterator();
while(iter.hasNext()){
  System.out.println(iter.next());
}

Actually, the for-each loop is implemented under the hood using an iterator. You can read more about this here . List provides its own version of an iterator, but a cooler and more sophisticated one - ListIterator . This interface extends Iterator and has additional methods:

hasPrevious will return true if there is a previous element in the collection, false otherwise;
previous returns the current element and moves to the previous one; if there is none, then a NoSuchElementException is thrown;
add will insert the passed object before the element to be returned by the next call to next() ;
set assigns the current element a reference to the passed object;
nextIndex returns the index of the next element. If there is no such thing, then the size of the list is returned;
previousIndex returns the index of the previous element. If there is none, then the number -1 is returned.

Well, that's all for me today. I hope that after reading this article you are even closer to your cherished dream - to become a developer.

Comments

TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION