JavaRush /Java Blog /Random EN /Sorting algorithms in theory and practice

Level 3

8 August 2023
244 views
0 comments

Sorting algorithms in theory and practice

Sorting is one of the basic types of activities or actions performed on objects. Even in childhood, children are taught to sort, developing their thinking. Computers and programs are also no exception. There are a huge variety of algorithms. I suggest you look at what they are and how they work. Plus, what if one day you get asked about one of these in an interview?

Introduction

Sorting elements is one of the categories of algorithms that a developer must get used to. If once upon a time, when I was studying, computer science was not taken so seriously, now in school they should be able to implement sorting algorithms and understand them. Basic algorithms, the simplest ones, are implemented using a loop for. Naturally, in order to sort a collection of elements, for example, an array, you need to somehow traverse this collection. For example:

int[] array = {10, 2, 10, 3, 1, 2, 5};
for (int i = 0; i < array.length; i++) {
	System.out.println(array[i]);
}

What can you say about this piece of code? We have a loop in which we change the index value ( int i) from 0 to the last element in the array. Essentially, we are simply taking each element in the array and printing its contents. The more elements in the array, the longer the code will take to execute. That is, if n is the number of elements, with n=10 the program will take 2 times longer to execute than with n=5. When our program has one loop, the execution time increases linearly: the more elements, the longer the execution. It turns out that the algorithm of the code above runs in linear time (n). In such cases, the “algorithm complexity” is said to be O(n). This notation is also called "big O" or "asymptotic behavior". But you can simply remember “the complexity of the algorithm.”

Sorting algorithms in theory and practice - 2

The simplest sorting (Bubble Sort)

So, we have an array and can iterate over it. Great. Let's now try to sort it in ascending order. What does this mean for us? This means that given two elements (for example, a=6, b=5), we must swap a and b if a is greater than b (if a > b). What does this mean for us when working with a collection by index (as is the case with an array)? This means that if the element with index a is greater than the element with index b, (array[a] > array[b]), such elements must be swapped. Changing places is often called swap. There are various ways to change places. But we use simple, clear and easy to remember code:

private void swap(int[] array, int ind1, int ind2) {
    int tmp = array[ind1];
    array[ind1] = array[ind2];
    array[ind2] = tmp;
}

Now we can write the following:

int[] array = {10, 2, 10, 3, 1, 2, 5};
System.out.println(Arrays.toString(array));
for (int i = 1; i < array.length; i++) {
	if (array[i] < array[i - 1]) {
		swap(array, i, i-1);
	}
}
System.out.println(Arrays.toString(array));

As we can see, the elements have indeed changed places. We started with one element, because... if the array consists of only one element, the expression 1 < 1 will not return true and thus we will protect ourselves from cases of an array with one element or without them at all, and the code will look better. But our final array is not sorted anyway, because... It is not possible to sort everyone in one pass. We will have to add another loop in which we will perform passes one by one until we get a sorted array:

int[] array = {10, 2, 10, 3, 1, 2, 5};
System.out.println(Arrays.toString(array));
boolean needIteration = true;
while (needIteration) {
	needIteration = false;
	for (int i = 1; i < array.length; i++) {
		if (array[i] < array[i - 1]) {
			swap(array, i, i-1);
			needIteration = true;
		}
	}
}
System.out.println(Arrays.toString(array));

So our first sorting worked. We iterate in the outer loop ( while) until we decide that no more iterations are needed. By default, before each new iteration, we assume that our array is sorted, and we don’t want to iterate any more. Therefore, we go through the elements sequentially and check this assumption. But if the elements are out of order, we swap the elements and realize that we are not sure that the elements are now in the correct order. Therefore, we want to perform one more iteration. For example, [3, 5, 2]. 5 is more than three, everything is fine. But 2 is less than 5. However, [3, 2, 5] requires one more pass, because 3 > 2 and they need to be swapped. Since we use a loop within a loop, it turns out that the complexity of our algorithm increases. With n elements it becomes n * n, that is O(n^2). This complexity is called quadratic. As we understand, we cannot know exactly how many iterations will be needed. The algorithm complexity indicator serves the purpose of showing the trend of increasing complexity, the worst case. How much will the running time increase when the number of elements n changes. Bubble sort is one of the simplest and most inefficient sorts. It is also sometimes called "stupid sorting". Related material:

" HackerRank: Algorithms - Bubble Sort "

Sorting algorithms in theory and practice - 3

Selection Sort

Another sort is selection sort. It also has quadratic complexity, but more on that later. So the idea is simple. Each pass selects the smallest element and moves it to the beginning. In this case, start each new pass by moving to the right, that is, the first pass - from the first element, the second pass - from the second. It will look like this:

int[] array = {10, 2, 10, 3, 1, 2, 5};
System.out.println(Arrays.toString(array));
for (int left = 0; left < array.length; left++) {
	int minInd = left;
	for (int i = left; i < array.length; i++) {
		if (array[i] < array[minInd]) {
			minInd = i;
		}
	}
	swap(array, left, minInd);
}
System.out.println(Arrays.toString(array));

This sorting is unstable, because identical elements (from the point of view of the characteristic by which we sort the elements) can change their position. A good example is given in the Wikipedia article: Sorting_by-selection . Related material:

Sorting algorithms in theory and practice - 4

Insertion Sort

Insertion sort also has quadratic complexity, since we again have a loop within a loop. How is it different from selection sort? This sorting is "stable". This means that identical elements will not change their order. Identical in terms of the characteristics by which we sort.

int[] array = {10, 2, 10, 3, 1, 2, 5};
System.out.println(Arrays.toString(array));
for (int left = 0; left < array.length; left++) {
	// Retrieve the value of the element
	int value = array[left];
	// Move through the elements that are before the pulled element
	int i = left - 1;
	for (; i >= 0; i--) {
		// If a smaller value is pulled out, move the larger element further
		if (value < array[i]) {
			array[i + 1] = array[i];
		} else {
			// If the pulled element is larger, stop
			break;
		}
	}
	// Insert the extracted value into the freed space
	array[i + 1] = value;
}
System.out.println(Arrays.toString(array));

Related material:

Sorting algorithms in theory and practice - 5

Shuttle Sort

Among the simple sorts, there is another one - shuttle sorting. But I prefer the shuttle variety. It seems to me that we rarely talk about space shuttles, and the shuttle is more of a run. Therefore, it is easier to imagine how shuttles are launched into space. Here's an association with this algorithm. What is the essence of the algorithm? The essence of the algorithm is that we iterate from left to right, and when swapping elements, we check all other elements that are left behind to see if the swap needs to be repeated.

int[] array = {10, 2, 10, 3, 1, 2, 5};
System.out.println(Arrays.toString(array));
for (int i = 1; i < array.length; i++) {
	if (array[i] < array[i - 1]) {
		swap(array, i, i - 1);
		for (int z = i - 1; (z - 1) >= 0; z--) {
			if (array[z] < array[z - 1]) {
				swap(array, z, z - 1);
			} else {
				break;
			}
		}
	}
}
System.out.println(Arrays.toString(array));

Related material:

AQA Decision 1 2.02a Introducing Shuttle Sort by Jack Brown

Sorting algorithms in theory and practice - 6

Shell sort

Another simple sort is Shell sort. Its essence is similar to bubble sort, but each iteration we have a different gap between the compared elements. Each iteration it is halved. Here is an example implementation:

int[] array = {10, 2, 10, 3, 1, 2, 5};
System.out.println(Arrays.toString(array));
// Calculate the gap between the checked elements
int gap = array.length / 2;
// As long as there is a difference between the elements
while (gap >= 1) {
    for (int right = 0; right < array.length; right++) {
        // Shift the right pointer until we can find one that
        // there won't be enough space between it and the element before it
       for (int c = right - gap; c >= 0; c -= gap) {
           if (array[c] > array[c + gap]) {
               swap(array, c, c + gap);
           }
        }
    }
    // Recalculate the gap
    gap = gap / 2;
}
System.out.println(Arrays.toString(array));

Related material:

Sorting algorithms in theory and practice - 7

Merge sort

In addition to the simple sorts indicated, there are more complex sorts. For example, merge sort. First, recursion will come to our aid. Secondly, our complexity will no longer be quadratic, as we are used to. The complexity of this algorithm is logarithmic. Written as O(n log n). So let's do this. First, let's write a recursive call to the sorting method:

public static void mergeSort(int[] source, int left, int right) {
        // Choose a separator, i.e. split the input array in half
        int delimiter = left + ((right - left) / 2) + 1;
        // Execute this function recursively for the two halves (if we can split(
        if (delimiter > 0 && right > (left + 1)) {
            mergeSort(source, left, delimiter - 1);
            mergeSort(source, delimiter, right);
        }
}

Now, let's add a main action to it. Here is an example of our supermethod with implementation:

public static void mergeSort(int[] source, int left, int right) {
        // Choose a separator, i.e. split the input array in half
        int delimiter = left + ((right - left) / 2) + 1;
        // Execute this function recursively for the two halves (if we can split(
        if (delimiter > 0 && right > (left + 1)) {
            mergeSort(source, left, delimiter - 1);
            mergeSort(source, delimiter, right);
        }
        // Create a temporary array with the desired size
        int[] buffer = new int[right - left + 1];
        // Starting from the specified left border, go through each element
        int cursor = left;
        for (int i = 0; i < buffer.length; i++) {
            // We use the delimeter to point to the element from the right side
            // If delimeter > right, then there are no unadded elements left on the right side
            if (delimiter > right || source[cursor] > source[delimiter]) {
                buffer[i] = source[cursor];
                cursor++;
            } else {
                buffer[i] = source[delimiter];
                delimiter++;
            }
        }
        System.arraycopy(buffer, 0, source, left, buffer.length);
}

Let's run the example by calling the mergeSort(array, 0, array.length-1). As you can see, the essence comes down to the fact that we take as input an array indicating the beginning and end of the section to be sorted. When sorting begins, this is the beginning and end of the array. Next we calculate delimiter - the position of the divider. If the divider can divide into 2 parts, then we call recursive sorting for the sections into which the divider divided the array. We prepare an additional buffer array in which we select the sorted section. Next, we place the cursor at the beginning of the area to be sorted and begin to go through each element of the empty array that we have prepared and fill it with the smallest elements. If the element pointed to by the cursor is smaller than the element pointed to by the divisor, we place this element in the buffer array and move the cursor. Otherwise, we place the element pointed to by the separator into the buffer array and move the separator. As soon as the separator goes beyond the boundaries of the sorted area or we fill the entire array, the specified range is considered sorted. Related material:

Sorting algorithms in theory and practice - 8

Counting Sort and Radix Sort

Another interesting sorting algorithm is Counting Sort. The algorithmic complexity in this case will be O(n+k), where n is the number of elements, and k is the maximum value of the element. There is one problem with the algorithm: we need to know the minimum and maximum values in the array. Here is an example implementation of counting sort:

public static int[] countingSort(int[] theArray, int maxValue) {
        // Array with "counters" ranging from 0 to the maximum value
        int numCounts[] = new int[maxValue + 1];
        // In the corresponding cell (index = value) we increase the counter
        for (int num : theArray) {
            numCounts[num]++;
        }
        // Prepare array for sorted result
        int[] sortedArray = new int[theArray.length];
        int currentSortedIndex = 0;
        // go through the array with "counters"
        for (int n = 0; n < numCounts.length; n++) {
            int count = numCounts[n];
            // go by the number of values
            for (int k = 0; k < count; k++) {
                sortedArray[currentSortedIndex] = n;
                currentSortedIndex++;
            }
        }
        return sortedArray;
    }

As we understand, it is very inconvenient when we have to know the minimum and maximum values in advance. And then there is another algorithm - Radix Sort. I will present the algorithm here only visually. For implementation, see materials:

Sorting algorithms in theory and practice - 9

Materials:

Video about Counting Sort

Sorting algorithms in theory and practice - 10

Java Quick Sort

Well, for dessert - one of the most famous algorithms: quick sort. It has algorithmic complexity, that is, we have O(n log n). This sort is also called Hoare sort. Interestingly, the algorithm was invented by Hoare during his stay in the Soviet Union, where he studied computer translation at Moscow University and was developing a Russian-English phrasebook. This algorithm is also used in a more complex implementation in Arrays.sort in Java. What about Collections.sort? I suggest you see for yourself how they are sorted “under the hood”. So, the code:

public static void quickSort(int[] source, int leftBorder, int rightBorder) {
        int leftMarker = leftBorder;
        int rightMarker = rightBorder;
        int pivot = source[(leftMarker + rightMarker) / 2];
        do {
            // Move the left marker from left to right while element is less than pivot
            while (source[leftMarker] < pivot) {
                leftMarker++;
            }
            // Move the right marker until element is greater than pivot
            while (source[rightMarker] > pivot) {
                rightMarker--;
            }
            // Check if you don't need to swap elements pointed to by markers
            if (leftMarker <= rightMarker) {
                // The left marker will only be less than the right marker if we have to swap
                if (leftMarker < rightMarker) {
                    int tmp = source[leftMarker];
                    source[leftMarker] = source[rightMarker];
                    source[rightMarker] = tmp;
                }
                // Move markers to get new borders
                leftMarker++;
                rightMarker--;
            }
        } while (leftMarker <= rightMarker);

        // Execute recursively for parts
        if (leftMarker < rightBorder) {
            quickSort(source, leftMarker, rightBorder);
        }
        if (leftBorder < rightMarker) {
            quickSort(source, leftBorder, rightMarker);
        }
}

Everything here is very scary, so we’ll figure it out. For the input array int[]source, we set two markers, left (L) and right (R). When called for the first time, they match the beginning and end of the array. Next, the supporting element is determined, aka pivot. After this, our task is to move values less than pivot, to the left pivot, and larger ones to the right. To do this, first move the pointer Luntil we find a value greater than pivot. If a smaller value is not found, then L совпадёт с pivot. Потом двигаем указатель R пока не найдём меньшее, чем pivot meaning. Если меньшее meaning не нашли, то R совпадёт с pivot. Далее, если указатель L находится до указателя R or совпадает с ним, то пытаемся выполнить обмен элементов, если элемент L меньше, чем R. Далее L сдвигаем вправо на 1 позицию, R сдвигаем влево на одну позицию. Когда левый маркер L окажется за правым маркером R это будет означать, что обмен закончен, слева от pivot меньшие значения, справа от pivot — большие значения. После этого рекурсивно вызываем такую же сортировку для участков массива от начала сортируемого участка до правого маркера и от левого маркера до конца сортируемого участка. Почему от начала до правого? Потому что в конце итерации так и получится, что правый маркер сдвинется настолько, что станет границей части слева. Этот алгоритм более сложный, чем простая sorting, поэтому его лучше зарисовать. Возьмём белый лист бумаги, запишем: 4 2 6 7 3 , а Pivot по центру, т.е. число 6. Обведём его в круг. Под 4 напишем L, под 3 напишем R. 4 меньше чем 6, 2 меньше чем 6. Total, L переместился на положение pivot, т.к. по условию L не может уйти дальше, чем pivot. Напишем снова 4 2 6 7 3 , обведём 6 вкруг ( pivot) и поставим под ним L. Теперь двигаем указатель R. 3 меньше чем 6, поэтому ставим маркер R на цифру 3. Так How 3 меньше, чем pivot 6 выполняем swap, т.е. обмен. Запишем результат: 4 2 3 7 6 , обводим 6 вкруг, т.к. он по прежнему pivot. Указатель L на цифре 3, указатель R на цифре 6. Мы помним, что двигаем указатели до тех пор, пока L не зайдём за R. L двигаем на следующую цифру. Тут хочется разобрать два варианта: если бы предпоследняя цифра была 7 и если бы она была не 7, а 1. Предпоследня цифра 1: Сдвинули указатель L на цифру 1, т.к. мы можем двигать L до тех пор, пока указатель L указывает на цифру, меньшую чем pivot. А вот R мы не можем сдвинуть с 6, т.к. R не мы можем двигать только если указатель R указывает на цифру, которая больше чем pivot. swap не делаем, т.к. 1 меньше 6. Записываем положение: 4 2 3 1 6, обводим pivot 6. L сдвигается на pivot и больше не двигается. R тоже не двигается. Обмен не производим. Сдвигаем L и R на одну позицию и подписываем цифру 1 маркером R, а L получается вне числа. Т.к. L вне числа — ничего не делаем, а вот часть 4 2 3 1 выписываем снова, т.к. это наша левая часть, меньшая, чем pivot 6. Выделяем новый pivot и начинаем всё снова ) Предпоследняя цифра 7: Сдвинули указать L на цифру 7, правый указатель не можем двигать, т.к. он уже указывает на pivot. т.к. 7 больше, чем pivot, то делаем swap. Запишем результат: 4 2 3 6 7, обводим 6 кружком, т.к. он pivot. Указатель L теперь сдвигается на цифру 7, а указатель R сдвигается на цифру 3. Часть от L до конца нет смысла сортировать, т.к. там всего 1 элемент, а вот часть от 4 до указателя R отправляем на сортировку. Выбираем pivot и начинаем всё снова ) Может на первый взгляд показаться, что если расставить много одинаковых с pivot значений, это сломает алгоритм, но это не так. Можно напридумывать каверзных вариантов и на бумажке убедиться, что всё правильно и подивиться, How такие простые действия предоставляют такой надёжный механизм. Единственный минус — такая sorting не является стабильной. Т.к. при выполнении обмена одинаковые элементы могут поменять свой порядок, если один из них встретился до pivot до того, How другой элемент попал в часть до pivot при помощи обмена. Материал:

Фоксфорд: Алгоритмы поиска и сортировки: Быстрая sorting Хоара

Итог

Выше мы рассмотрели "джентельменский" набор алгоритмов сортировки, реализованных на Java. Алгоритмы вообще штука полезная, How с прикладной точки зрения, так и с точки зрения развития мышления. Некоторые из них попроще, некоторые посложнее. По Howим-то умные люди защищали различные работы на степени, а по другим писали толстые-толстые книги. Надеюсь, приложенный к статье материал позволит вам узнать ещё больше, так How это обзорная статья, которая и так получилась увесистой. И цель её — дать небольшое вступление. Про введение в алгоритмы можно так же прочитать ознакомиться с книгой " Грокаем Алгоримы". Также мне нравится плэйлист от Jack Brown — AQA Decision 1 1.01 Tracing an Algorithm. Ради интереса можно посмотреть на визуализацию алгоритмов на sorting.at и visualgo.net. Ну и весь Youtube к вашим услугам.

Comments

TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION