Aleksandr Zimin
Level 1
Saint Petersburg

How classes are loaded into the JVM

Published in the Random EN group
Once the hardest part of a programmer's work is done and the “Hello World 2.0” application is written, all that remains is to assemble the distribution and hand it over to the customer, or at least to the testing team. Everything in the distribution is as it should be, and when we launch the program, the Java Virtual Machine comes onto the scene. It is no secret that the virtual machine reads the commands stored in class files as bytecode and translates them into instructions for the processor. Let's look at how that bytecode gets into the virtual machine.

Class loader

A class loader supplies compiled bytecode to the JVM. The bytecode is usually stored in files with the .class extension, but it can also come from other sources: for example, it can be downloaded over the network or generated by the application itself. According to the Java SE specification, getting code to run in the JVM takes three steps:
  • loading bytecode from resources and creating an instance of the class Class

    This includes searching for the requested class among those already loaded, obtaining the bytecode and checking its correctness, creating an instance of the class Class (for working with it at runtime), and loading parent classes. If the parent classes and interfaces have not been loaded, the class in question is considered not loaded.

  • binding (or linking)

    According to the specification, this stage is divided into three more stages:

    • Verification: the correctness of the received bytecode is checked.
    • Preparation: memory is allocated for static fields, and they are initialized with default values (explicit initialization, if any, happens later, at the initialization stage).
    • Resolution: symbolic references to types, fields, and methods are resolved.
  • initializing the resulting class

    Here, unlike the previous steps, it seems clear what should happen: the static initializers and explicit static field initializers of the class are executed. It would, of course, be interesting to understand exactly how this happens.
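The difference between Preparation (default values) and Initialization (explicit values) can be observed directly. The following is a small sketch, not taken from the specification; the class name PreparationDemo and its members are made up for illustration. It relies on static initializers running in textual order.

```java
// Hypothetical demo class; the names are illustrative only.
class PreparationDemo {
    // Static initializers run in textual order during Initialization.
    // When peekSecond() is called here, `second` has not been explicitly
    // initialized yet, so it still holds the default value from Preparation.
    static int first = peekSecond();
    static int second = 42;

    static int peekSecond() {
        return second;
    }

    public static void main(String[] args) {
        // prints "first = 0, second = 42"
        System.out.println("first = " + first + ", second = " + second);
    }
}
```

The field first captures 0 because the explicit assignment second = 42 had not yet run at that moment.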

All these steps are performed sequentially, subject to the following requirements:
  • A class must be fully loaded before it is linked.
  • A class must be fully verified and prepared before it is initialized.
  • Errors in resolving symbolic references are reported during program execution, at the point where the reference is actually used, even if they were detected earlier, at the linking stage.
As you know, Java loads classes lazily. This means that the classes of the reference fields of a loaded class are not loaded until the application encounters an explicit reference to them. In other words, resolving symbolic references is optional and does not happen by default. However, a JVM implementation may also use eager class loading, in which all symbolic references are resolved immediately. The last requirement exists for exactly this case. It is also worth noting that the resolution of symbolic references is not tied to any particular stage of class loading. Each of these stages deserves a study of its own; let's try to figure out the first one, namely loading the bytecode.
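A small sketch can show that loading a class and initializing it are separate events. It uses the standard Class.forName(String, boolean, ClassLoader) method, whose second argument controls initialization. The class names LazyInitDemo and Heavy are invented for this example, and the result is cached only so that repeated calls report the same observation.

```java
// Hypothetical demo; LazyInitDemo and Heavy are illustrative names.
class LazyInitDemo {
    static boolean initialized = false;
    private static boolean[] cached;

    static class Heavy {
        // Runs only when Heavy is initialized, not when it is merely loaded.
        static { LazyInitDemo.initialized = true; }
    }

    // Returns {flag after loading only, flag after initialization}.
    static synchronized boolean[] run() throws ClassNotFoundException {
        if (cached == null) {
            ClassLoader loader = LazyInitDemo.class.getClassLoader();
            // A class literal loads the class but does not initialize it.
            String name = Heavy.class.getName();
            Class.forName(name, false, loader);   // load without initializing
            boolean afterLoad = initialized;
            Class.forName(name, true, loader);    // now force initialization
            boolean afterInit = initialized;
            cached = new boolean[] { afterLoad, afterInit };
        }
        return cached.clone();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        boolean[] r = run();
        // prints "after load: false, after init: true"
        System.out.println("after load: " + r[0] + ", after init: " + r[1]);
    }
}
```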

Types of Java Loaders

In Java 8 and earlier, there are three standard loaders, each of which loads classes from a specific location:
  1. Bootstrap, the base loader, also called the Primordial ClassLoader.

    Loads standard JDK classes from the rt.jar archive.

  2. Extension ClassLoader, the extension loader.

    Loads extension classes, which are located in the jre/lib/ext directory by default; the location can be changed with the java.ext.dirs system property.

  3. System ClassLoader, the system loader.

    Loads application classes from the locations listed in the CLASSPATH environment variable.

Java uses a hierarchy of class loaders with the bootstrap loader at the root. Next comes the extension loader, and then the system loader. Each loader stores a reference to its parent so that it can delegate loading to the parent first and try to load the class itself only if the parent cannot.
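The hierarchy can be inspected at runtime by following getParent() upward from the system loader. This is a minimal sketch; LoaderChain and chainOf are names invented here, and the concrete loader class names printed depend on the Java version. The code assumes, as common JVMs do, that the bootstrap loader is represented by null.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the names are illustrative only.
class LoaderChain {
    // Collects the class names of the loaders from `start` up to the root.
    static List<String> chainOf(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader l = start; l != null; l = l.getParent()) {
            names.add(l.getClass().getName());
        }
        // Assumption: a null parent stands for the native bootstrap loader.
        names.add("bootstrap (represented as null)");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chainOf(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
    }
}
```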

Abstract class ClassLoader

Every loader except the bootstrap one is a descendant of the abstract class java.lang.ClassLoader. For example, the extension loader is implemented by the class sun.misc.Launcher$ExtClassLoader, and the system loader by sun.misc.Launcher$AppClassLoader. The bootstrap loader is native, and its implementation is part of the JVM. Any class that extends java.lang.ClassLoader can provide its own way of loading classes, with whatever extras it likes. To do this, it has to override the corresponding methods, which for now I can cover only superficially because I have not studied this topic in detail. Here they are:
package java.lang;

// Abridged outline of the methods discussed below (method bodies omitted).
public abstract class ClassLoader {
    public Class<?> loadClass(String name) throws ClassNotFoundException;
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException;
    protected final Class<?> findLoadedClass(String name);
    public final ClassLoader getParent();
    protected Class<?> findClass(String name) throws ClassNotFoundException;
    protected final void resolveClass(Class<?> c);
}
loadClass(String name) is one of the few public methods and the entry point for loading classes. Its implementation boils down to calling the protected method loadClass(String name, boolean resolve), which a subclass can override (although the Javadoc encourages overriding findClass(String name) instead). According to the Javadoc of this protected method, it takes two parameters. The first is the binary name of the class to load (the fully qualified class name, including all packages). The second is a flag that determines whether symbolic references should be resolved. loadClass(String name) passes false, which means lazy class loading is used.

According to the documentation, the default implementation first calls findLoadedClass(String name), which checks whether the class has already been loaded and, if so, returns a reference to it. Otherwise, the loading method of the parent loader is called. If none of the loaders has the class already loaded, each of them, in reverse order, tries to find and load it by calling findClass(String name), the method a custom loader overrides. This is discussed in more detail in the section “Class loading scheme”. Finally, once the class has been loaded, the resolve flag determines whether the classes it refers to symbolically are resolved as well. This is a clear example of the Resolution stage being triggered during the class loading stage.

Accordingly, by extending ClassLoader and overriding its methods, a custom loader can implement its own logic for delivering bytecode to the virtual machine.

Java also supports the concept of a “current” class loader: the one that loaded the currently executing class. Each class knows which loader loaded it, and you can get this information by calling getClassLoader() on its Class object, for example String.class.getClassLoader(). For all application classes, the “current” loader is usually the system one.
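As an illustration of extending ClassLoader, here is a minimal sketch of a custom loader that overrides only findClass. It assumes the class files are reachable as resources on the class path; the names ResourceClassLoader and toResourcePath are made up for this example. The parent is deliberately set to the parent of the system loader, so that application classes are not found by delegation and findClass is actually reached.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical custom loader; a sketch, not a production implementation.
class ResourceClassLoader extends ClassLoader {
    ResourceClassLoader() {
        // Assumption: the system loader's parent cannot see application classes.
        super(ClassLoader.getSystemClassLoader().getParent());
    }

    // Converts a binary name such as "a.b.C" to the resource path "a/b/C.class".
    static String toResourcePath(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    // Called by loadClass only after the parent loaders failed to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(toResourcePath(name))) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            byte[] bytes = out.toByteArray();
            // Hands the bytecode to the JVM and gets a Class object back.
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

Requests for JDK classes such as java.lang.String still end up with the bootstrap loader through delegation, while a class that is found only in findClass would report this loader from getClassLoader().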

Three Principles of Class Loading

  • Delegation

    The request to load a class is passed to the parent loader, and the loader attempts to load the class itself only if the parent was unable to find and load it. This way, classes are loaded by the loader that is as close as possible to the bootstrap one, which gives them maximum visibility. Each loader keeps a record of the classes it has loaded, placing them in its cache. The set of these classes is called its scope.

  • Visibility

    A loader sees only “its own” classes and those of its “parent”, and knows nothing about classes loaded by its “child”.

  • Uniqueness

    A class can be loaded only once. The delegation mechanism ensures that the loader that initiates loading does not reload a class that was already loaded into the JVM.

Thus, when writing a custom loader, a developer should be guided by these three principles.
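Delegation and uniqueness can be checked with a couple of standard calls. The sketch below uses the invented class name DelegationDemo. It assumes a typical JVM in which the bootstrap loader is reported as null by getClassLoader().

```java
// Hypothetical demo class; the names are illustrative only.
class DelegationDemo {
    // Asking the system loader for java.lang.String is delegated upward,
    // so the result is the very same Class object as String.class.
    static boolean sameStringClass() throws ClassNotFoundException {
        Class<?> viaSystemLoader =
                ClassLoader.getSystemClassLoader().loadClass("java.lang.String");
        return viaSystemLoader == String.class;
    }

    // Assumption: the JVM represents the bootstrap loader as null.
    static boolean stringLoadedByBootstrap() {
        return String.class.getClassLoader() == null;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        System.out.println("same Class object: " + sameStringClass());
        System.out.println("loaded by bootstrap: " + stringLoadedByBootstrap());
    }
}
```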

Class loading scheme

When a request to load a class arrives, the class is first looked up in the current loader's cache of already loaded classes. If the class has not been loaded before, the delegation principle passes control to the parent loader, one level higher in the hierarchy. The parent loader also tries to find the class in its cache. If the class has already been loaded, its Class object is returned. If not, the search continues upward until it reaches the bootstrap loader.

If the bootstrap loader has no information about the class (that is, it has not been loaded yet), the loader searches for the bytecode in the locations it knows about. If the class cannot be loaded there, control returns to the child loader, which tries to load it from the sources known to it. As mentioned above, the bootstrap loader looks in the rt.jar library, the extension loader in the jre/lib/ext directory, the system loader in CLASSPATH, and a custom loader wherever it was written to look. Thus, actual loading proceeds in the opposite direction: from the root loader down to the current one.

When the bytecode is found, the class is loaded into the JVM and an instance of type Class is obtained. As you can see, this scheme matches the implementation of loadClass(String name) described above. The diagram below illustrates it.
How classes are loaded in the JVM - 2
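The same scheme can be written out as code. The following is a simplified sketch modeled on the behavior described above, not the JDK source; ParentFirstLoader is a made-up name, and the sketch assumes a non-null parent (the real default implementation also handles the bootstrap loader as a parent).

```java
import java.util.Objects;

// Hypothetical loader that spells out the parent-first scheme step by step.
class ParentFirstLoader extends ClassLoader {
    ParentFirstLoader(ClassLoader parent) {
        super(Objects.requireNonNull(parent, "parent"));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // 1. Look in this loader's cache of already loaded classes.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // 2. Delegate to the parent, which repeats the same steps.
                    c = getParent().loadClass(name);
                } catch (ClassNotFoundException notFoundByParents) {
                    // 3. Only now try this loader's own sources.
                    c = findClass(name);
                }
            }
            // 4. Optionally resolve symbolic references (link the class).
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

Since findClass is not overridden here, step 3 always throws ClassNotFoundException; a real custom loader would supply its own lookup at that point.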

Conclusion

When you are taking your first steps in the language, there is no particular need to understand how classes are loaded in Java, but knowing these basic principles will help you avoid despair when you run into errors such as ClassNotFoundException or NoClassDefFoundError, or at least roughly understand the root of the problem. A ClassNotFoundException is thrown when a class is loaded dynamically at runtime and the loaders cannot find it either in the cache or on the class path. NoClassDefFoundError is more serious: it occurs when the required class was available at compile time but was not visible at runtime. This can happen if a library the program uses was not included in the distribution.

More generally, understanding the principles behind a tool you work with (not necessarily a deep dive into its internals) brings clarity about the processes inside it, which in turn leads to confident use of the tool.
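The ClassNotFoundException case is easy to reproduce with dynamic loading. Below is a small sketch; ClassProbe and isLoadable are names invented for this example, and the missing class name in main is deliberately fictitious.

```java
// Hypothetical helper; the names are illustrative only.
class ClassProbe {
    // Returns true if the class can be found by the loader of ClassProbe.
    static boolean isLoadable(String binaryName) {
        try {
            // false: load the class without running its static initializers.
            Class.forName(binaryName, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            // No loader in the chain could find the class.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLoadable("java.lang.String"));              // prints "true"
        System.out.println(isLoadable("com.example.missing.NoSuchType")); // prints "false"
    }
}
```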

Sources

  • How ClassLoader Works in Java. Overall a very useful source with an accessible presentation of information.
  • Loading classes, ClassLoader. Quite a lengthy article, with an emphasis on writing your own loader implementation.
  • ClassLoader: dynamic loading of classes. Unfortunately, this resource is no longer available, but it had the clearest diagram of the class loading scheme I found, so I can't help but include it.
  • Java SE Specification: Chapter 5. Loading, Linking, and Initializing.