Aleksandr Zimin
Saint Petersburg

How classes are loaded in the JVM

Published in the Random EN group
Once the hardest part of a programmer's work is done and the "Hello World 2.0" application is written, all that remains is to assemble the distribution kit and hand it over to the customer, or at least to the testing team. Everything in the distribution kit is in its place, and when we launch our program, the Java Virtual Machine enters the scene. It's no secret that the virtual machine reads the commands stored in class files as bytecode and translates them into instructions for the processor. I propose that we take a brief look at how bytecode gets into the virtual machine.

Class loader

A class loader supplies the JVM with compiled bytecode, which is usually stored in files with the .class extension. According to the Java SE specification, getting code to run in the JVM takes three steps:
  • loading the bytecode from resources and creating an instance of the class Class

    this includes searching for the requested class among those already loaded, obtaining the bytecode to load and checking its correctness, creating an instance of the class Class (to work with it at runtime), and loading the parent classes. If the parent classes and interfaces have not been loaded, the class in question is considered not loaded.

  • linking

    According to the specification, this stage is divided into three more stages:

    • Verification: the correctness of the received bytecode is checked.
    • Preparation: memory is allocated for static fields, which are initialized with default values (explicit initialization, if any, happens later, at the initialization stage).
    • Resolution: symbolic references to types, fields, and methods are resolved.
  • initialization of the resulting class

    here, unlike the previous steps, it seems clear what should happen. Still, it would be interesting to understand exactly how it happens.
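
The difference between Preparation and initialization can be observed from ordinary code. The sketch below is only an illustration (the names PreparationDemo, early, late and peek are made up for this example): a static field that is read before its explicit initializer has run still holds the default value it received at the Preparation stage.

```java
// Illustrative sketch; PreparationDemo, early, late and peek are hypothetical names.
class PreparationDemo {
    // Static initializers run in textual order during the initialization stage.
    // When peek() is called here, 'late' has already been prepared (default 0),
    // but its explicit initializer has not run yet.
    static int early = peek();
    static int late = 7;

    static int peek() {
        return late;
    }

    public static void main(String[] args) {
        System.out.println("early = " + early); // prints "early = 0"
        System.out.println("late = " + late);   // prints "late = 7"
    }
}
```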

All these stages are performed sequentially with the following requirements:
  • The class must be fully loaded before being linked.
  • A class must be fully verified and prepared before being initialized.
  • Errors detected during linking are reported at the point in the program's execution where the class involved is actually used, even if they were detected earlier, during the linking phase.
As you know, Java loads classes lazily. This means that the classes of a loaded class's reference fields are not loaded until the application actually uses them. In other words, symbolic references do not have to be resolved up front, and by default they are not. However, a JVM implementation may also use eager class loading, in which all symbolic references are resolved at once. The last requirement above exists precisely for this case. It is also worth noting that the resolution of symbolic references is not tied to any particular stage of class loading. Each of these stages deserves a thorough study of its own; let's try to figure out the first one, namely loading the bytecode.
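
Laziness also applies to the last step: a class can be loaded without being initialized. The sketch below is an illustration under assumed names (LazyInitDemo and Helper are hypothetical). It uses the three-argument Class.forName with initialize set to false, and then triggers initialization by reading a static field.

```java
// Illustrative sketch; LazyInitDemo and Helper are hypothetical names.
class LazyInitDemo {
    // Records when Helper's static initializer actually runs.
    static final StringBuilder LOG = new StringBuilder();

    static class Helper {
        static int value = 42;
        static {
            LOG.append("Helper initialized");
        }
    }

    // Loads Helper without initializing it and reports whether the initializer has run.
    static boolean initializedAfterLoad() {
        try {
            String name = LazyInitDemo.class.getName() + "$Helper";
            // The 'false' argument asks for loading without initialization.
            Class.forName(name, false, LazyInitDemo.class.getClassLoader());
            return LOG.length() > 0;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads a static field, which is a first active use and triggers initialization.
    static boolean initializedAfterUse() {
        int ignored = Helper.value;
        return LOG.length() > 0;
    }

    public static void main(String[] args) {
        System.out.println("after load: " + initializedAfterLoad()); // false
        System.out.println("after use:  " + initializedAfterUse());  // true
    }
}
```

The static initializer runs only once, on the first active use of Helper, no matter how many times the class is looked up afterwards.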

Java loader types

There are three standard loaders in Java, each of which loads a class from a specific location:
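  1. Bootstrap is the base loader, also called the Primordial ClassLoader.

    loads the standard JDK classes from the rt.jar archive (in Java 8 and earlier; since Java 9 they are loaded from the runtime's modules)

  2. Extension ClassLoader is the extension loader.

    loads extension classes, which by default are located in the jre/lib/ext directory; the location can be changed with the java.ext.dirs system property (since Java 9 the platform class loader has taken over this role)

  3. System ClassLoader is the system loader.

    loads the application classes from the locations listed in the CLASSPATH environment variable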

Java uses a hierarchy of class loaders whose root is, of course, the base loader. Next comes the extension loader, followed by the system loader. Each loader keeps a reference to its parent so that it can delegate loading to the parent before trying to load the class itself.
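
The hierarchy can be inspected at runtime by following getParent() from the system loader. This is a minimal sketch (LoaderChainDemo and chain are hypothetical names); the loader class names it prints depend on the Java version, so no particular output is assumed here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; LoaderChainDemo and chain are hypothetical names.
class LoaderChainDemo {
    // Collects the loader class names from the given loader up to the root.
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // The base (bootstrap) loader is native, so it is typically reported as null.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // The exact class names depend on the Java version and vendor.
        chain(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
    }
}
```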

Abstract class ClassLoader

Every loader except the base loader is a descendant of the abstract class java.lang.ClassLoader. For example, the extension loader is implemented by the class sun.misc.Launcher$ExtClassLoader, and the system loader by sun.misc.Launcher$AppClassLoader. The base loader is native, and its implementation is part of the JVM. Any class that extends java.lang.ClassLoader can provide its own way of loading classes, with all the bells and whistles. To do this, it must override the corresponding methods, which for now I can cover only superficially, because I have not studied this topic in detail. Here they are:
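package java.lang;

// Abbreviated outline of the API: method bodies are omitted.
public abstract class ClassLoader {
    // Public entry point for loading a class by its binary name.
    public Class<?> loadClass(String name) throws ClassNotFoundException;
    // Does the actual work: cache lookup, delegation to the parent, then findClass.
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException;
    // Returns the class if this loader has already loaded it, otherwise null.
    protected final Class<?> findLoadedClass(String name);
    // Returns the parent loader used for delegation.
    public final ClassLoader getParent();
    // Locates and defines the class; custom loaders usually override this.
    protected Class<?> findClass(String name) throws ClassNotFoundException;
    // Links the given class.
    protected final void resolveClass(Class<?> c);
}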
loadClass(String name) is one of the few public methods and serves as the entry point for class loading. Its implementation boils down to calling the protected method loadClass(String name, boolean resolve), which a custom loader can override (although the Javadoc recommends overriding findClass(String name) instead).

According to the Javadoc of this protected method, it takes two parameters. The first is the binary name of the class to be loaded (the fully qualified class name), that is, the class name with all its packages listed. The second is a flag that determines whether symbolic references should be resolved. The public method passes false, which means that classes are loaded lazily.

Further, according to the documentation, the default implementation calls findLoadedClass(String name), which checks whether the class has already been loaded and, if so, returns a reference to it. Otherwise, the parent loader's loadClass method is called. If none of the parent loaders could find the class, each of them, in reverse order, tries to find and load it by calling findClass(String name), which custom loaders override. This is discussed in more detail in the section “Class loading scheme”. Finally, once the class has been loaded successfully, the resolve flag determines whether the classes referenced through symbolic references are loaded as well. This is a clear example of the Resolution stage being invoked during the class loading stage.

Accordingly, by extending the class ClassLoader and overriding its methods, a custom loader can implement its own logic for delivering bytecode to the virtual machine.

Java also supports the notion of a "current" class loader. The current loader is the one that loaded the currently executing class. Each class knows which loader loaded it, and you can get this information by calling getClassLoader() on its Class object, for example String.class.getClassLoader(). For all application classes, the "current" loader is usually the system loader.
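
To make this more concrete, here is a minimal sketch of a custom loader that overrides only findClass and reads bytecode from a directory. It is an illustration, not a reference implementation: DirectoryLoaderDemo, DirectoryClassLoader, the "plugins" directory and the class name com.example.Missing are all assumptions made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch; DirectoryLoaderDemo, DirectoryClassLoader, the "plugins"
// directory and com.example.Missing are hypothetical.
class DirectoryLoaderDemo {

    // Loads classes from .class files under a root directory.
    static class DirectoryClassLoader extends ClassLoader {
        private final Path root;

        DirectoryClassLoader(Path root, ClassLoader parent) {
            super(parent); // the parent is asked first (delegation)
            this.root = root;
        }

        // Called by the inherited loadClass only after the parent chain has failed.
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            Path file = root.resolve(name.replace('.', '/') + ".class");
            try {
                byte[] bytes = Files.readAllBytes(file);
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // Returns the loaded class, or null if no loader in the chain can find it.
    static Class<?> loadOrNull(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = new DirectoryClassLoader(
                Paths.get("plugins"), ClassLoader.getSystemClassLoader());

        // java.lang.String is found by the parent chain, so findClass is never called.
        System.out.println(loadOrNull(loader, "java.lang.String") == String.class); // true

        // No loader can find this class, so findClass runs and fails
        // (assuming no such class exists on the class path or under "plugins").
        System.out.println(loadOrNull(loader, "com.example.Missing")); // null

        // Classes loaded by the base loader report null as their loader.
        System.out.println(String.class.getClassLoader()); // null
    }
}
```

Overriding findClass rather than loadClass keeps the inherited cache lookup and parent delegation intact, so the custom loader only has to say where its bytecode comes from.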

Three Principles of Class Loading

  • Delegation

    A request to load a class is first passed to the parent loader, and the loader tries to load the class itself only if the parent could not find and load it. With this approach, a class is loaded by the loader closest to the base one, which gives the class the widest possible scope. Each loader keeps a record of the classes it has loaded in its cache. The set of these classes is called its scope.

  • Visibility

    The loader sees only "its" classes and the classes of the "parent" and has no idea about the classes that were loaded by its "child".

  • Uniqueness

    A class can be loaded only once. The delegation mechanism ensures that the loader initiating the load does not reload a class that has already been loaded in the JVM.

Thus, when writing a custom loader, the developer should follow these three principles.
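
The principles can be observed with the standard URLClassLoader. In this sketch (PrinciplesDemo and loadSelfVia are hypothetical names) the loader is given no locations of its own, so everything it returns comes from delegation.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative sketch; PrinciplesDemo and loadSelfVia are hypothetical names.
class PrinciplesDemo {
    // Tries to load this very class through a new loader with the given parent.
    // Returns the resulting Class object, or null if the class is not visible.
    static Class<?> loadSelfVia(ClassLoader parent) {
        // The loader has no locations of its own, so it can only delegate.
        try (URLClassLoader loader = new URLClassLoader(new URL[0], parent)) {
            return loader.loadClass(PrinciplesDemo.class.getName());
        } catch (ClassNotFoundException e) {
            return null;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader own = PrinciplesDemo.class.getClassLoader();

        // Delegation and uniqueness: the child receives the class its parent
        // has already loaded instead of loading a second copy.
        System.out.println(loadSelfVia(own) == PrinciplesDemo.class); // true

        // Visibility: with the base loader as the only parent,
        // application classes loaded further down the chain are not visible.
        System.out.println(loadSelfVia(null)); // null
    }
}
```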

Class loading scheme

When a request to load a class arrives, the current loader first searches for the class in its cache of already loaded classes. If the class has not been loaded yet, then, following the delegation principle, control passes to the parent loader, one level higher in the hierarchy. The parent loader also looks for the class in its cache. If the class has already been loaded and the loader knows about it, the Class object of this class is returned. If not, the search continues until it reaches the base loader. If the base loader has no information about the class either (i.e. it has not been loaded yet), it searches for the bytecode of the class in the locations it knows about, and if the class cannot be loaded, control returns to the child loader, which tries to load it from the sources known to it. As mentioned above, the class location for the base loader is the rt.jar library, for the extension loader it is the jre/lib/ext extensions directory, for the system loader it is the CLASSPATH, and for a custom loader it can be something else. Thus, the actual loading proceeds in the opposite direction: from the root loader to the current one. When the bytecode of the class is found, a Class object is created for it. As you can see, the described loading scheme matches the implementation of loadClass(String name) described above. Below you can see this scheme in the diagram.
[Diagram: class loading scheme]
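
The same scheme can be traced in code. The sketch below is a simplified re-implementation of the delegation order written only to log each step; it is not the JDK's actual source. TracingLoaderDemo, TracingLoader and com.example.Missing are hypothetical, and the loader assumes a non-null parent.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; TracingLoaderDemo, TracingLoader and com.example.Missing are hypothetical.
class TracingLoaderDemo {

    // Repeats the delegation order described above and records every step.
    // Assumes a non-null parent; the real ClassLoader also handles the base loader.
    static class TracingLoader extends ClassLoader {
        private final String label;
        private final List<String> trace;

        TracingLoader(String label, ClassLoader parent, List<String> trace) {
            super(parent);
            this.label = label;
            this.trace = trace;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                trace.add(label + ": check cache");
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    try {
                        trace.add(label + ": delegate to parent");
                        c = getParent().loadClass(name);
                    } catch (ClassNotFoundException e) {
                        trace.add(label + ": parent failed, try findClass");
                        c = findClass(name); // the default findClass throws ClassNotFoundException
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    // Loads a class through a two-level chain of tracing loaders and returns the recorded steps.
    static List<String> traceLoad(String name) {
        List<String> trace = new ArrayList<>();
        ClassLoader parent = new TracingLoader("parent", ClassLoader.getSystemClassLoader(), trace);
        ClassLoader child = new TracingLoader("child", parent, trace);
        try {
            child.loadClass(name);
            trace.add("loaded");
        } catch (ClassNotFoundException e) {
            trace.add("not found");
        }
        return trace;
    }

    public static void main(String[] args) {
        traceLoad("java.lang.String").forEach(System.out::println);
        System.out.println("---");
        traceLoad("com.example.Missing").forEach(System.out::println);
    }
}
```

For a class that no loader can find, the trace shows the request travelling up from "child" to "parent" and the findClass attempts happening on the way back down, in the opposite order.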

As a conclusion

In your first steps of learning the language, there is no particular need to understand how classes are loaded in Java, but knowing these basic principles will keep you from despair when you run into errors such as ClassNotFoundException or NoClassDefFoundError, or at least help you roughly understand the root of the problem. ClassNotFoundException occurs when a class is loaded dynamically during program execution and the loaders cannot find it either in the cache or on the class path. NoClassDefFoundError is more critical: it occurs when the required class was available at compile time but is not visible at run time. This can happen if you forgot to include a library the program uses in its distribution. And simply understanding the principles behind the tool you work with (without necessarily diving deep into its internals) brings clarity about the processes going on inside it, which in turn leads to using the tool with confidence.
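
The first of these two cases is easy to reproduce. The sketch below (MissingClassDemo, tryLoad and com.example.Missing are hypothetical names) loads classes dynamically by name. NoClassDefFoundError is not reproduced here, because that would require a class file that is present at compile time and missing at run time.

```java
// Illustrative sketch; MissingClassDemo, tryLoad and com.example.Missing are hypothetical.
class MissingClassDemo {
    // Tries to load a class by name at run time and reports what happened.
    static String tryLoad(String name) {
        try {
            Class.forName(name);
            return "loaded";
        } catch (ClassNotFoundException e) {
            // The name was looked up dynamically and no loader could find it.
            return "ClassNotFoundException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLoad("java.util.ArrayList"));  // loaded
        // Assumes no class with this name exists on the class path.
        System.out.println(tryLoad("com.example.Missing"));  // ClassNotFoundException
    }
}
```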

Sources

  • How ClassLoader Works in Java. Overall a very useful source with an accessible presentation of the information.
  • Loading classes, ClassLoader. A rather voluminous article, with an emphasis on how to write your own loader implementation, with all the bells and whistles.
  • ClassLoader: Dynamic Class Loading. Unfortunately, this resource is no longer available, but it had the most understandable class loading diagram I found, so I could not leave it out.
  • Java SE Specification: Chapter 5. Loading, Linking, and Initializing