Those who are just getting started with Java often confuse the concepts of machine code and bytecode. What are they? How do they differ? In this short note, we will try to describe their features as simply and clearly as possible, to settle the question once and for all.
Machine code
A processor is, in essence, a very complex and advanced calculator. It has many small memory cells (called registers), and arithmetic and bitwise operations are performed on them and between them. Machine code is precisely a description of that sequence of operations together with the data involved. In fact, it is the only language that your computer's processor understands.
Inherent incompatibility
At the same time, not all processors “speak” the same language. There are differences not only between CISC and RISC architectures, but also within these “camps”.
CISC (Complex Instruction Set Computing) is a processor design concept characterized by the following properties:
- many instructions of varying length;
- many addressing modes;
- complex instruction encoding.
RISC (Reduced Instruction Set Computing) is a processor design concept with a reduced instruction set: the instructions share the same format, are short, and are simply encoded.
New generations of processors introduce additional instruction sets that older models simply do not know. Because of this, a program compiled for one architecture cannot run on another, and a program that uses the newer instructions cannot run on an older generation of processors. All this forces us to recompile programs so that they work on other computers. However, processors are not the only reason to recompile: programs also interact with the operating system in different ways. That is why a “Windows” program cannot be run directly under Linux, and a “Linux” program cannot be run directly under Windows.
Bytecode
Bytecode is in many ways similar to machine code, except that its instruction set belongs to a virtual processor rather than a real one. Moreover, it may include sections intended for a JIT compiler, which optimizes the execution of instructions for the real processor on which the program is running.
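To make this more tangible, here is a minimal Java sketch that looks at the container bytecode is stored in: a compiled .class file. It reads the first eight bytes of the class file of java.lang.String and prints the magic number and the format version. The names ClassFileHeader and readHeader are our own and were chosen purely for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeader {

    /**
     * Reads the first 8 bytes of a compiled top-level class:
     * magic number, minor version, major version.
     */
    public static int[] readHeader(Class<?> type) {
        // A top-level class Foo is stored in its package directory as Foo.class.
        String resource = type.getSimpleName() + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IOException("Class file not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();            // 0xCAFEBABE in every valid class file
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();  // e.g. 52 for classes compiled for Java 8
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(String.class);
        System.out.printf("magic = 0x%08X%n", header[0]);
        System.out.println("class file version = " + header[2] + "." + header[1]);
    }
}
```

The header is the same on every operating system and processor, which is exactly the point: the file describes a program for the virtual machine, not for the hardware. To see the virtual processor's instructions themselves, you can run javap -c on any compiled class.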
JIT compilation (just-in-time compilation, on-the-fly compilation), also called dynamic compilation or dynamic translation, is a technology that increases the performance of bytecode-based software systems by compiling the bytecode into machine code, or into another format, while the program is running. “Officially”, Java had only a JIT compiler until version 9. Java 9 added another, experimental compiler that works ahead of time (AOT); it was later removed again in JDK 17. This feature allows Java classes to be compiled into native code before the virtual machine is launched. It is designed to improve startup times for both small and large applications, with limited impact on peak performance.
When targeting CISC processors, several instructions can be combined into more complex instructions supported by the processor; for RISC, on the contrary, they can be broken down into simpler sequences of instructions.
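The JIT compiler works in the background, but the JVM lets you look at it through the standard java.lang.management API. The sketch below is only an illustration, and the class name JitInfo and the method sumOfSquares are our own. It asks the running JVM for the name of its JIT compiler, runs a small loop many times so that the JVM is likely to consider it "hot", and then prints how much time was spent compiling, if the JVM reports it. The exact numbers depend on the JVM and the machine.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitInfo {

    /** CPU-bound work that the JVM is likely to treat as "hot" code. */
    public static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Returns null if this JVM has no compilation system (interpreter only).
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            System.out.println("This JVM has no JIT compiler.");
            return;
        }
        System.out.println("JIT compiler: " + jit.getName());

        // Call the same method many times to give the JIT a reason to compile it.
        long checksum = 0;
        for (int round = 0; round < 20_000; round++) {
            checksum += sumOfSquares(1_000);
        }
        System.out.println("checksum = " + checksum);

        if (jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("time spent compiling: "
                    + jit.getTotalCompilationTime() + " ms");
        }
    }
}
```

Launching the same class with the standard -Xint option turns the JIT compiler off and leaves only the interpreter, which is a simple way to feel the difference on a longer-running loop.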
Also a virtual OS
However, bytecode contains more than processor instructions. It also contains logic for interacting with a virtual operating system, which makes the application's behavior independent of the operating system installed on the computer. This is clearly visible in the JVM, where system calls and the GUI often work the same way regardless of the OS on which the program is running. By and large, the JVM virtualizes the environment of a single program process, unlike solutions such as VirtualBox, which only create a virtual system and virtual hardware.
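A small sketch of what this looks like in practice: the program below never hard-codes "/" or "\" and never calls the operating system directly. It only uses standard library classes, and the JVM translates the calls for whichever OS it is running on. The class name PortablePaths, the method configFile, and the file name settings.properties are hypothetical and serve only as an illustration.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PortablePaths {

    /** Builds a path from name parts without hard-coding a separator. */
    public static Path configFile(String baseDir, String appName) {
        return Paths.get(baseDir, appName, "settings.properties");
    }

    public static void main(String[] args) {
        // The same bytecode prints different values on different systems.
        System.out.println("OS: " + System.getProperty("os.name"));
        System.out.println("separator: " + File.separator);
        System.out.println("config: "
                + configFile(System.getProperty("user.home"), "demo-app"));
    }
}
```

The compiled class file can be copied unchanged from a Windows machine to a Linux one; only the printed values differ.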
Is the JVM the only one like this?
Definitely not. The .NET CLR (Microsoft's implementation of the Common Language Infrastructure, CLI) is also a virtual machine, and it is most often used on computers running Windows with x86-compatible processors. However, there are implementations for other systems: applications for it should also run on Windows RT on ARM (RISC) compatible processors, or you can run them on Linux/OSX in the Mono environment, which is a third-party (and therefore not fully compatible) implementation of .NET for these platforms. So this platform, like the JVM, runs on different processors and different operating systems. There are many more similar solutions, both old and new: LLVM, Flash SWF, and others. Some programming languages have their own virtual machines. For example, CPython compiles .py sources into .pyc files, which contain compiled bytecode prepared to run in the PVM (Python virtual machine). A much older example is Lisp, which can be compiled into FASL (Fast Load) files. In essence, they contain an AST (abstract syntax tree) built by the generator from the source code. These files can be read and executed by the Lisp interpreter on various platforms, or used to generate machine code for the hardware architecture currently in use.