
Coffee break #210. All Types of Garbage Collectors in Java You Should Know About

Published in the Random EN group
Source: Hackernoon

Through this post, you will learn the strengths and weaknesses of each type of garbage collector used in Java development.

Questions about the garbage collector (GC) come up in almost every interview, so I decided to collect all the necessary information about them using my favorite principle: short and simple. Let's start with the purpose of a GC and why we need several types of garbage collectors. In languages like C, we have to track every allocation ourselves and write a lot of boilerplate code to free that memory. Unsurprisingly, memory leaks are common in such programs. Java addresses this problem with a garbage collector. As a developer, you should know which garbage collector fits your case, because a lot depends on where and how your program runs. It may run on weak hardware, create a large number of objects, or need to be very fast. Based on these conditions, you tune your garbage collector to achieve the desired performance. So, let's begin.

How the JVM deals with memory

The Java Virtual Machine (JVM) divides memory into two areas: the heap, which stores application data, and the non-heap, which stores program code and other data. Let's turn our attention to the heap. This is where our program creates new objects.

Most garbage collectors build on the observation that many programs use ephemeral objects: objects that are created, fulfill their function and are then no longer needed. Most objects are like this. Some objects, however, live much longer, perhaps for the entire duration of the program. This is where the idea of dividing objects into a young and an old generation comes from, with the young generation being checked very often. Accordingly, garbage collection is divided into minor collections, which affect only the young generation, and full collections, which can affect both generations.

Remember that the garbage collector is a program. It needs time and resources from your computer to work, which also affects our application. How? For example, to perform garbage collection, the JVM pauses our application. This is called a Stop-The-World (STW) pause. During this time, all application threads are suspended, but the application itself is completely unaware of it. For the application, time flows evenly. Why is this so bad? Imagine you are writing an exchange application or an airplane autopilot. Your application could go to sleep for one second, and the nature of your problem could change dramatically. So pause length is a significant parameter for every garbage collector.

The next fundamental property of a garbage collector is the total time spent collecting garbage in relation to the total execution time of the program. Why is this so important? Instead of one big Stop-The-World phase, we can choose an algorithm with many small pauses. Small pauses are preferable, but nothing comes for free: we pay with a longer total execution time of the program, and we must take this into account too.

The next parameter is the amount of hardware resources. Each collector needs memory to store information about objects and processor time to perform the cleanup.

The last parameter is speed. Garbage collection efficiency refers to how quickly and efficiently the garbage collector reclaims memory that the program no longer uses.

All these parameters shape the algorithm, whose goal is to free memory as quickly as possible while consuming minimal resources. Let's take a look at the garbage collectors available to us. For the interview you need to know the first five. The other two are much more difficult.
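Before looking at the individual collectors, it can help to see these parameters in numbers. The following is a minimal sketch, not a benchmarking tool: it uses the standard java.lang.management API to print heap and non-heap usage and the collection counts and times that the running JVM reports. The class name GcOverview and the "overhead" calculation are illustrative assumptions; the reported collection time is an approximation and, for concurrent collectors, does not equal pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper class, named for this example only.
class GcOverview {
    /** Sum of the collection times (ms) reported by all collectors so far. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the value is undefined
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Rough share of JVM uptime reported as GC time, clamped to [0, 1]. */
    static double gcOverhead() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptime <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) totalGcMillis() / uptime);
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("Heap used (bytes):     " + memory.getHeapMemoryUsage().getUsed());
        System.out.println("Non-heap used (bytes): " + memory.getNonHeapMemoryUsage().getUsed());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.printf("Approximate GC overhead: %.2f%%%n", gcOverhead() * 100);
    }
}
```

The collector names printed by this program depend on which garbage collector the JVM was started with, so running it with different collectors is a quick way to see which one is active.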

Serial GC

Serial GC is the oldest garbage collector of the Java Virtual Machine and has been available since the beginning of Java. It is useful for programs with a small heap that run on less powerful machines. This garbage collector divides the heap into regions, which include Eden and Survivor. The Eden region is the pool from which memory for most objects is initially allocated. Survivor is a pool containing objects that survived a garbage collection of the Eden region. As the heap fills, objects are moved between the Eden and Survivor regions. The JVM constantly monitors how often objects are moved between the Survivor regions and selects an appropriate threshold for the number of such moves, after which an object is moved to the Tenured region. When there is not enough space in the Tenured region, a full garbage collection takes over and works on objects of both generations. The main advantage of this garbage collector is its low resource requirements: a low-power processor is enough to perform the collection. The main disadvantage of Serial GC is the long pauses during garbage collection, especially with large amounts of data.
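The aging of objects described above can be sketched as a toy model. This is an illustration under simplifying assumptions, not how the JVM is implemented: Eden and the Survivor spaces are merged into a single "young" list, reachability is a hand-set flag, and the class and field names (TenuringSketch, Obj, tenuringThreshold) are hypothetical. On a real JVM, Serial GC itself is selected with the -XX:+UseSerialGC option.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of generational aging; all names are hypothetical.
class TenuringSketch {
    static final class Obj {
        final String name;
        boolean reachable = true; // set by hand here; a real GC discovers this
        int age = 0;              // number of minor collections survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();   // Eden + Survivor, merged for brevity
    final List<Obj> tenured = new ArrayList<>(); // old generation
    final int tenuringThreshold;

    TenuringSketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj obj = new Obj(name);
        young.add(obj);
        return obj;
    }

    /** Minor collection: dead objects vanish, survivors age, old survivors are promoted. */
    void minorCollection() {
        List<Obj> survivors = new ArrayList<>();
        for (Obj obj : young) {
            if (!obj.reachable) {
                continue; // garbage is simply not copied
            }
            obj.age++;
            if (obj.age >= tenuringThreshold) {
                tenured.add(obj);
            } else {
                survivors.add(obj);
            }
        }
        young.clear();
        young.addAll(survivors);
    }

    public static void main(String[] args) {
        TenuringSketch heap = new TenuringSketch(2);
        heap.allocate("cache");
        heap.allocate("temp").reachable = false;
        heap.minorCollection();
        heap.minorCollection();
        System.out.println("young=" + heap.young.size() + ", tenured=" + heap.tenured.size());
    }
}
```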

Parallel GC

The parallel garbage collector (Parallel GC) is built on the ideas of Serial GC, but adds parallel processing of some tasks and the ability to tune performance settings automatically. If the computer has more than one processor core, older versions of the JVM select Parallel GC automatically. The heap is divided into the same regions as in Serial GC: Eden, Survivor 0, Survivor 1 and Old Gen (Tenured). However, multiple threads take part in garbage collection in parallel, and each collector thread is given its own memory area to clean. Parallel GC also has settings aimed at reaching a required level of garbage collection efficiency: the collector uses statistics from previous collections to tune performance settings for future ones. Parallel GC thus provides automatic tuning of performance parameters and shorter collection pauses, with one minor drawback in the form of some memory fragmentation. It is suitable for most applications, but for more demanding programs it is better to choose a more advanced garbage collector.

Pros: Faster than Serial GC in many cases. Good throughput.

Cons: Consumes more resources, and pauses can be quite long, although we can set a target for the maximum Stop-The-World pause duration.
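The idea that each collector thread gets its own memory area can be illustrated with a small sketch. It is only an analogy under stated assumptions: the "heap" is a boolean array in which false means garbage, and the class ParallelSweepSketch and its method are hypothetical names. On a real JVM the collector is selected with -XX:+UseParallelGC, and the number of GC threads and the pause target are set with -XX:ParallelGCThreads and -XX:MaxGCPauseMillis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Analogy only: worker threads each scan their own slice of a toy heap.
class ParallelSweepSketch {
    /** Counts garbage slots (live[i] == false), splitting the array among workers. */
    static int countGarbage(boolean[] live, int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (live.length + workers - 1) / workers; // slice size, rounded up
            List<CompletableFuture<Integer>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(live.length, w * chunk);
                final int to = Math.min(live.length, from + chunk);
                parts.add(CompletableFuture.supplyAsync(() -> {
                    int dead = 0;
                    for (int i = from; i < to; i++) {
                        if (!live[i]) {
                            dead++;
                        }
                    }
                    return dead;
                }, pool));
            }
            int total = 0;
            for (CompletableFuture<Integer> part : parts) {
                total += part.join(); // wait for every worker to finish its slice
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] live = {true, false, false, true, false};
        System.out.println("Garbage slots: " + countGarbage(live, 2));
    }
}
```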

Concurrent Mark Sweep

The Concurrent Mark Sweep (CMS) garbage collector aims to reduce the maximum pause length by running some garbage collection tasks concurrently with application threads. It is an alternative to Parallel GC in the Java Virtual Machine (JVM), suitable for managing large amounts of data in memory and intended for applications that have multiple processor cores available and are sensitive to Stop-The-World pauses. CMS performs most garbage collection steps concurrently with the main program, so the application keeps running during most of the collection. It uses the same memory organization as the Serial and Parallel collectors, but does not wait for the Tenured area to fill up before cleaning the old generation. Instead, it runs in the background and tries to keep the Tenured region compact.

Concurrent Mark Sweep begins with an initial marking phase that briefly stops the application's threads and marks all objects directly accessible from the roots. The application's threads then resume, and CMS starts searching for all live objects that are reachable by references from the marked root objects. After marking all live objects, the collector frees the memory of dead objects in several parallel threads.

One of the benefits of CMS is its focus on minimizing pause time, which is critical for many applications. However, it demands a sacrifice in terms of CPU resources and overall throughput. In addition, CMS does not compact objects in the old generation, which leads to fragmentation. Long pauses due to possible concurrent mode failures can be an unpleasant surprise (although they do not happen often). If there is enough memory, CMS can avoid such pauses.

Pros: Fast. Has small Stop-The-World pauses.

Cons: Consumes more memory; if there is insufficient memory, some pauses may be long. Not very good if the application creates a lot of objects.
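The mark and sweep phases named in this collector's title can be sketched on a toy object graph. This is a single-threaded illustration of the basic technique only, with hypothetical names (MarkSweepSketch, Node); the real CMS does most of this work concurrently with the application and has additional phases that are not modeled here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Single-threaded toy mark-and-sweep; names are hypothetical.
class MarkSweepSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        boolean marked;
        Node(String name) { this.name = name; }
    }

    /** Mark phase: flag every node reachable from the roots. */
    static void mark(Collection<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (node.marked) {
                continue; // already visited, also protects against cycles
            }
            node.marked = true;
            pending.addAll(node.refs);
        }
    }

    /** Sweep phase: remove unmarked nodes, clear marks on survivors, return freed names. */
    static List<String> sweep(List<Node> heap) {
        List<String> freed = new ArrayList<>();
        Iterator<Node> it = heap.iterator();
        while (it.hasNext()) {
            Node node = it.next();
            if (node.marked) {
                node.marked = false; // reset for the next cycle
            } else {
                freed.add(node.name);
                it.remove();
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c"); // never referenced
        a.refs.add(b);
        List<Node> heap = new ArrayList<>(List.of(a, b, c));
        mark(List.of(a));
        System.out.println("Freed: " + sweep(heap));
    }
}
```

Note that the sweep leaves the surviving nodes where they are, which mirrors why a collector of this kind can leave memory fragmented.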

Garbage-First

Garbage-First (G1) is considered an alternative to CMS, especially for server applications that run on multi-processor machines and manage large data sets. The G1 garbage collector divides the heap into many equally sized regions, with the exception of humongous regions (which are created by merging regular regions to accommodate very large objects). Regions do not have to be contiguous and can change the generation they belong to. Minor collections run periodically on the young generation, moving objects to Survivor regions or promoting them to the old generation (Tenured). Only as many regions are cleaned as fit into the desired pause time; the collector itself predicts and selects the regions with the largest amount of garbage. To clean the old generation, G1 runs a marking cycle, concurrent with the application, that builds a list of live objects. After the marking cycle, G1 switches to mixed collections, which add old generation regions to the set of young generation regions to be collected. G1 is considered more accurate than CMS at predicting pause lengths and spreads garbage collection over time better, which prevents long application pauses, especially with large heaps. Unlike CMS, it also does not fragment memory. However, G1 needs more CPU resources to run alongside the application, which reduces application throughput.

Pros: Works better than CMS. Has shorter pauses.

Cons: Consumes more CPU resources. It also consumes more memory if there are many fairly large objects (more than 500 KB), because it places each such object in a region of its own (regions are 1-32 MB).
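The "garbage first" selection described above can be sketched as a greedy choice under a pause budget. This is a simplified illustration, not G1's actual heuristics: the Region class, its fields and the predicted cost per region are hypothetical, and the sketch simply takes the regions with the most garbage until the next one would exceed the budget.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Greedy toy version of collection-set selection; all names are hypothetical.
class RegionSelectionSketch {
    static final class Region {
        final int id;
        final long garbageBytes;      // how much could be reclaimed
        final double predictedMillis; // assumed cost of collecting this region
        Region(int id, long garbageBytes, double predictedMillis) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedMillis = predictedMillis;
        }
    }

    /** Picks regions with the most garbage first, stopping before the pause budget is exceeded. */
    static List<Region> chooseCollectionSet(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region region : sorted) {
            if (spent + region.predictedMillis > pauseBudgetMillis) {
                break; // the next region would not fit into the desired pause
            }
            chosen.add(region);
            spent += region.predictedMillis;
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region(1, 900, 4.0),
                new Region(2, 100, 1.0),
                new Region(3, 500, 5.0),
                new Region(4, 700, 3.0));
        for (Region region : chooseCollectionSet(regions, 8.0)) {
            System.out.println("Collect region " + region.id);
        }
    }
}
```

On a real JVM the pause goal that plays the role of the budget here is set with the -XX:MaxGCPauseMillis option; G1 treats it as a target rather than a guarantee.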

Epsilon GC

Epsilon GC is designed for situations where garbage collection is not required. It does not perform garbage collection, but uses TLABs (thread-local allocation buffers), small memory buffers that individual threads request from the heap, to allocate new objects. Huge objects that do not fit into a buffer request memory blocks specifically for themselves. When Epsilon GC runs out of heap, an OutOfMemoryError is generated and the process terminates. The benefits of Epsilon GC include lower resource requirements and faster memory allocation for applications that create all the objects they need at startup, or for short-lived applications that do not use all of the allocated memory. Epsilon GC can also help measure the resource overhead that other garbage collectors add to your application.

Pros: Very fast.

Cons: Doesn't clear objects :)

The next two collectors are the most advanced of their kind, but also the most complex. Therefore, we will consider them briefly.
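For Epsilon GC, the "allocate and never reclaim" behavior can be sketched with a bump-pointer allocator, the kind of allocation typically used inside such buffers. This is a toy model with hypothetical names (BumpAllocatorSketch, allocate); it hands out offsets in an imaginary heap instead of real memory and fails with an OutOfMemoryError once the capacity is used up. On JDK 11 and later, the real collector is enabled with -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC.

```java
// Toy allocator that only moves a pointer forward and never frees; names are hypothetical.
class BumpAllocatorSketch {
    private final long capacityBytes;
    private long top = 0; // offset of the first free byte

    BumpAllocatorSketch(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Returns the start offset of the new block, or fails when the toy heap is exhausted. */
    long allocate(long sizeBytes) {
        if (sizeBytes < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        if (sizeBytes > capacityBytes - top) {
            throw new OutOfMemoryError("sketch heap exhausted");
        }
        long start = top;
        top += sizeBytes; // "bump" the pointer; nothing is ever given back
        return start;
    }

    long used() {
        return top;
    }

    public static void main(String[] args) {
        BumpAllocatorSketch heap = new BumpAllocatorSketch(100);
        System.out.println("first block at " + heap.allocate(40));
        System.out.println("second block at " + heap.allocate(60));
        System.out.println("used: " + heap.used() + " bytes");
    }
}
```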

ZGC

ZGC can maintain sub-millisecond pauses even when dealing with huge amounts of data. ZGC is a garbage collector developed by Oracle for Java that is designed to provide high throughput and low latency when processing large heaps (up to 16 TB). ZGC builds on virtual memory techniques and uses colored pointers, that is, marker bits stored in object references, to track the state of objects during garbage collection.

Pros: Pauses are shorter than a millisecond, even on large heaps, which is very useful for applications that need short request processing times. It works with very large heaps with good throughput. ZGC can compact heap memory during garbage collection.

Cons: High CPU usage and significant performance requirements, which can slow down application startup.
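The general idea of colored pointers, keeping state bits inside the reference itself, can be sketched with plain bit operations on a long. The layout below is an assumption made for illustration, not ZGC's actual one: 44 address bits are used only because 2^44 bytes equals 16 TB, and the two color names and the class ColoredPointerSketch are hypothetical.

```java
// Illustrative bit layout only; it does not reproduce ZGC's real pointer format.
class ColoredPointerSketch {
    static final int ADDRESS_BITS = 44;                        // 2^44 bytes = 16 TB
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1; // low bits hold the address
    static final long MARKED = 1L << ADDRESS_BITS;             // hypothetical color bit
    static final long REMAPPED = 1L << (ADDRESS_BITS + 1);     // hypothetical color bit

    /** Returns the pointer with its color replaced by the given color bit. */
    static long color(long pointer, long colorBit) {
        return (pointer & ADDRESS_MASK) | colorBit;
    }

    /** Strips the color bits and returns the plain address. */
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    static boolean has(long pointer, long colorBit) {
        return (pointer & colorBit) != 0;
    }

    public static void main(String[] args) {
        long pointer = color(0x1234L, MARKED);
        System.out.println("address: 0x" + Long.toHexString(address(pointer)));
        System.out.println("marked: " + has(pointer, MARKED));
        System.out.println("remapped: " + has(pointer, REMAPPED));
    }
}
```

Because the state travels with the reference, a collector built this way can decide what to do with an object by inspecting the pointer alone, without first reading the object it points to.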

Shenandoah GC

Shenandoah GC is another garbage collector with short pauses regardless of heap size. It is developed by Red Hat and is designed to minimize the time an application spends paused for garbage collection. Like ZGC, it is a concurrent collector, which means it does its work while the application is running, minimizing pauses. Shenandoah GC uses "forwarding pointers" to move objects during garbage collection. It also has a technique called "load barrier removal" to improve performance.

Pros: Shenandoah GC can achieve short pause times, often less than 10 ms, even for massive heaps. Good throughput.

Cons: High processor load and difficulty in working under heavy loads.
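The forwarding pointer idea mentioned above can be sketched as follows. This is a single-threaded toy with hypothetical names (ForwardingSketch, Obj, forwardee), not Shenandoah's implementation: every object carries a reference that points to itself until the object is moved, and to the new copy afterwards, so that accesses through an old reference can still find the current copy. A real concurrent collector has to install this pointer atomically, which is omitted here.

```java
// Single-threaded toy model of forwarding pointers; names are hypothetical.
class ForwardingSketch {
    static final class Obj {
        int value;
        Obj forwardee = this; // points to itself until the object is moved
        Obj(int value) { this.value = value; }
    }

    /** Follows the forwarding pointer to the current copy of the object. */
    static Obj resolve(Obj ref) {
        return ref.forwardee;
    }

    /** Copies the object to a "new location" once and records where it went. */
    static Obj evacuate(Obj old) {
        Obj current = resolve(old);
        if (current != old) {
            return current; // already moved, reuse the existing copy
        }
        Obj copy = new Obj(old.value);
        old.forwardee = copy;
        return copy;
    }

    public static void main(String[] args) {
        Obj original = new Obj(7);
        Obj moved = evacuate(original);
        resolve(original).value = 8; // a write through the old reference reaches the copy
        System.out.println("value seen via the new copy: " + moved.value);
    }
}
```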

Conclusion

Garbage collection is one of the most difficult problems in programming, and new developments in this area appear all the time. While it's rare for programmers to tweak the GC, you still need to have at least a passing knowledge of how your garbage collector works.