What are runtime data areas
Java developer interview questions and answers, analyzed. Part 15
Java Core
9. What is the difference between static and dynamic binding in Java?
10. Can private or protected variables be used in an interface?
11. What is a Classloader and what is it used for?
Bootstrap ClassLoader: the base loader, implemented at the JVM level. It has no link back to the runtime environment, because it is part of the JVM core and is written in native code. This loader serves as the parent of all other ClassLoader instances.
It is mainly responsible for loading the internal JDK classes, typically rt.jar and other core libraries located in the $JAVA_HOME/jre/lib directory. Different platforms may have different implementations of this class loader.
Extension Classloader: the extensions loader, a child of the bootstrap loader in the delegation hierarchy. It takes care of loading extensions of the standard core Java classes. It loads from the JDK extensions directory, usually $JAVA_HOME/lib/ext, or from any other directory listed in the java.ext.dirs system property (this option lets you control how extensions are loaded).
System ClassLoader: the system loader, implemented at the JRE level, which takes care of loading all application-level classes into the JVM. It loads files found on the class path, set through the CLASSPATH environment variable or the -classpath / -cp command-line option.
1. The System Classloader tries to find the class in its cache.
1.1. If the class is found, loading completes successfully.
1.2. If the class is not found, loading is delegated to the Extension Classloader.
2. The Extension Classloader tries to find the class in its own cache.
2.1. If the class is found, loading completes successfully.
2.2. If the class is not found, loading is delegated to the Bootstrap Classloader.
3. The Bootstrap Classloader tries to find the class in its own cache.
3.1. If the class is found, loading completes successfully.
3.2. If the class is not found, the Bootstrap Classloader tries to load it.
4. If loading by the Bootstrap Classloader:
4.1. Succeeds, loading of the class is complete.
4.2. Fails, control passes to the Extension Classloader.
5. The Extension Classloader tries to load the class, and if loading:
5.1. Succeeds, loading of the class is complete.
5.2. Fails, control passes to the System Classloader.
6. The System Classloader tries to load the class, and if loading:
6.1. Succeeds, loading of the class is complete.
6.2. Fails, a ClassNotFoundException is thrown.
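The delegation chain described above can be observed from plain Java. The following is a minimal sketch, not a definitive tool: the class name LoaderChainDemo and the helper chainOf are made up for illustration. It assumes a standard HotSpot-style JDK; on Java 9 and later the parent of the system loader is the platform class loader rather than the extension loader, and the bootstrap loader is always reported as null.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    /** Walks from the given loader up through its parents; null stands for the bootstrap loader. */
    static List<String> chainOf(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        chain.add("bootstrap (null)");
        return chain;
    }

    public static void main(String[] args) {
        // Application classes: system loader -> extension/platform loader -> bootstrap.
        System.out.println(chainOf(ClassLoader.getSystemClassLoader()));

        // Core classes such as String are loaded by the bootstrap loader, reported as null.
        System.out.println(String.class.getClassLoader());

        // When every loader in the chain fails, the request ends in ClassNotFoundException.
        try {
            ClassLoader.getSystemClassLoader().loadClass("no.such.Clazz");
        } catch (ClassNotFoundException e) {
            System.out.println("ClassNotFoundException: " + e.getMessage());
        }
    }
}
```

The exact loader class names in the printed chain differ between JDK versions, so the sketch prints them rather than relying on them.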
The topic of class loaders is broad and should not be neglected. To learn more about it, I recommend reading this article; we will not linger here and will move on.
12. What are Run-Time Data Areas?
PC Register: the program counter register. It is local to each thread and holds the address of the JVM instruction that the thread is currently executing.
JVM Stack: a memory area used to store local variables and intermediate results. Each thread has its own stack; as soon as the thread terminates, its stack is destroyed too. It is worth noting that the stack's advantage over the heap is speed, while the heap clearly wins on storage capacity.
Native Method Stack: a per-thread data area that holds data similar to the JVM stack and is used to execute native (non-Java) methods.
Heap: shared by all threads as the storage for objects and arrays created at run time. This area is created when the JVM starts and destroyed when it shuts down.
Method Area: this run-time area is shared by all threads and is created when the JVM starts. It stores per-class structures such as the run-time constant pool, the code for constructors and methods, method data, and so on.
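The split between per-thread and shared areas can be illustrated with a small sketch. It is an illustration under assumptions, not a measurement of the JVM's internals: the class SharedVsLocalDemo and its run method are hypothetical names. Each worker counts in a local variable (which lives in that thread's own stack frame) and in one shared AtomicInteger object (which lives on the heap and is visible to all threads).

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedVsLocalDemo {
    // One heap object reachable from every thread.
    static final AtomicInteger shared = new AtomicInteger();

    /** Starts `threads` workers and returns the final value of each worker's local counter. */
    static int[] run(int threads, int iterations) {
        int[] localTotals = new int[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                int local = 0; // private to this thread's stack frame
                for (int i = 0; i < iterations; i++) {
                    local++;
                    shared.incrementAndGet();
                }
                localTotals[id] = local;
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join also makes the worker's writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return localTotals;
    }

    public static void main(String[] args) {
        int[] totals = run(2, 1000);
        System.out.println("local counters: " + totals[0] + ", " + totals[1]);
        System.out.println("shared counter: " + shared.get());
    }
}
```

Each local counter ends at the iteration count, while the shared counter accumulates the work of all threads.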
13. What is an immutable object?
14. What is special about the String class?
It is the most popular object type in Java, used for all sorts of purposes. In frequency of use it rivals even the primitive types.
String is an immutable class: once an object of this class has been created, its data cannot be changed (when you append + "another string" to some string, the result is a new, third string). The immutability of the String class makes it thread-safe.
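A short sketch of what immutability means in practice; the class and method names are chosen for illustration only. Concatenation produces a new String object and leaves the original untouched.

```java
class StringImmutabilityDemo {
    /** Returns a new String; the object referenced by base is never modified. */
    static String append(String base, String suffix) {
        return base + suffix;
    }

    public static void main(String[] args) {
        String a = "Java";
        String b = append(a, " Core");
        System.out.println(a);      // Java
        System.out.println(b);      // Java Core
        System.out.println(a == b); // false: b is a different object
    }
}
```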
The String class is final (declared with the final modifier), so it cannot be subclassed.
String has its own string pool, an area of memory in the heap that caches string literals and interned strings. I described the string pool in question 62 of this part of the series.
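The effect of the string pool can be seen by comparing references. This is a minimal sketch with an illustrative class name; it relies only on the documented behavior of string literals and String.intern().

```java
class StringPoolDemo {
    public static void main(String[] args) {
        String literal1 = "hello";
        String literal2 = "hello";
        String created = new String("hello"); // explicitly creates a separate object

        System.out.println(literal1 == literal2);         // true: both refer to the pooled literal
        System.out.println(literal1 == created);          // false: distinct object outside the pool
        System.out.println(literal1 == created.intern()); // true: intern() returns the pooled instance
        System.out.println(literal1.equals(created));     // true: same characters
    }
}
```

This is also why strings should be compared with equals() rather than ==.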
The JVM spec defines certain run-time data areas that are needed during the execution of the program. Some of them are created while the JVM starts up. Others are local to threads and are created only when a thread is created (and destroyed when the thread is destroyed). These are listed below:
PC (Program Counter) Register
It is local to each thread and contains the address of the JVM instruction that the thread is currently executing.
Stack
It is local to each thread and stores parameters, local variables and return addresses during method calls. A StackOverflowError is thrown if a thread demands more stack space than is permitted. If the stack is dynamically expandable, an OutOfMemoryError can still be thrown when there is not enough memory to expand it.
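A StackOverflowError is easy to provoke with unbounded recursion. The sketch below uses hypothetical names and only counts how many frames fit before the error appears; the number it prints depends on the JVM, the platform, and the configured stack size.

```java
class StackDepthDemo {
    private static int depth;

    // Every call pushes one more frame onto the current thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    /** Recurses until the stack is exhausted and returns the depth that was reached. */
    static int maxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The thread demanded more stack than is permitted.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before StackOverflowError: " + maxDepth());
    }
}
```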
Heap
It is shared among all the threads and contains the objects and arrays that are created during run-time. It is created when the JVM starts and is destroyed when the JVM shuts down. You can control the amount of heap your JVM demands from the OS using certain flags (more on this later). Care has to be taken not to demand too little or too much memory, as this has important performance implications. Further, the GC manages this space and continually removes dead objects to free up the space.
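The current heap limits can be read at run time through java.lang.Runtime. This is a small sketch with an illustrative class name; on HotSpot the values are typically influenced by the -Xms (initial size) and -Xmx (maximum size) options.

```java
class HeapInfoDemo {
    /** Returns {maxMemory, totalMemory, freeMemory} in bytes, as reported by the JVM. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mib = 1024L * 1024L;
        System.out.println("max heap the JVM will try to use: " + s[0] / mib + " MiB");
        System.out.println("heap currently reserved:          " + s[1] / mib + " MiB");
        System.out.println("free within the reserved heap:    " + s[2] / mib + " MiB");
    }
}
```

Running it with different -Xmx values, for example java -Xmx256m HeapInfoDemo, is a quick way to see the effect of the flag.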
Method Area
This run-time area is common to all threads and is created when the JVM starts up. It stores per-class structures such as the constant pool (more on this later), the code for constructors and methods, method data, etc. The JVM specification does not require this area to be garbage collected, and hence, implementations of the JVM may choose not to collect it. Further, it may or may not expand as per the application's needs; the specification does not mandate anything in this regard.
Run-Time Constant Pool
The JVM maintains a per-class/per-type data structure that acts as the symbol table (one of its many roles) while linking the loaded classes.
Native Method Stacks
When a thread invokes a native method, it enters a new world in which the structures and security restrictions of the Java virtual machine no longer hamper its freedom. A native method can likely access the runtime data areas of the virtual machine (it depends upon the native method interface), but can also do anything else it wants.
Garbage Collection
The JVM manages the entire lifecycle of objects in Java. Once an object is created, the developer need not worry about it anymore. In case the object becomes dead (that is, there is no reference to it anymore), it is ejected from the heap by the GC using one of the many algorithms – serial GC, CMS, G1, etc.
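Reachability can be observed with a WeakReference, which does not keep its target alive. The sketch below is illustrative (the class and method names are made up) and deliberately hedged: System.gc() is only a request, so the second result is typical on HotSpot but not guaranteed.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    /** An object that is still strongly referenced must survive a collection. */
    static boolean survivesWhileReferenced() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // a request, not a command
        // 'strong' is used after the GC request, so the object stays reachable throughout.
        return weak.get() == strong;
    }

    /** Once no strong reference remains, the object becomes eligible for collection. */
    static boolean clearedAfterDrop() {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        return weak.get() == null; // usually true, but the JVM may not have collected yet
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + survivesWhileReferenced());
        System.out.println("cleared after dropping the reference: " + clearedAfterDrop());
    }
}
```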
During the GC process, many collectors move objects in memory. Those objects are not usable while they are being moved, so application threads have to be paused for at least part of the process. Such pauses are called 'stop-the-world' pauses and are a significant overhead. GC algorithms aim primarily to reduce this time. We shall discuss this in great detail in the following chapters.
Thanks to the GC, memory leaks are very rare in Java, but they can happen. We will see in the later chapters how to create a memory leak in Java.
2.5 Run-Time Data Areas
The Java Virtual Machine defines various run-time data areas that are used during execution of a program. Some of these data areas are created on Java Virtual Machine start-up and are destroyed only when the Java Virtual Machine exits. Other data areas are per thread. Per-thread data areas are created when a thread is created and destroyed when the thread exits.
2.5.1 The pc Register
The Java Virtual Machine can support many threads of execution at once (JLS §17). Each Java Virtual Machine thread has its own pc (program counter) register. At any point, each Java Virtual Machine thread is executing the code of a single method, namely the current method (§2.6) for that thread. If that method is not native, the pc register contains the address of the Java Virtual Machine instruction currently being executed. If the method currently being executed by the thread is native, the value of the Java Virtual Machine's pc register is undefined. The Java Virtual Machine's pc register is wide enough to hold a returnAddress or a native pointer on the specific platform.
2.5.2 Java Virtual Machine Stacks
In the First Edition of The Java® Virtual Machine Specification, the Java Virtual Machine stack was known as the Java stack.
This specification permits Java Virtual Machine stacks either to be of a fixed size or to dynamically expand and contract as required by the computation. If the Java Virtual Machine stacks are of a fixed size, the size of each Java Virtual Machine stack may be chosen independently when that stack is created.
A Java Virtual Machine implementation may provide the programmer or the user control over the initial size of Java Virtual Machine stacks, as well as, in the case of dynamically expanding or contracting Java Virtual Machine stacks, control over the maximum and minimum sizes.
The following exceptional conditions are associated with Java Virtual Machine stacks:
• If the computation in a thread requires a larger Java Virtual Machine stack than is permitted, the Java Virtual Machine throws a StackOverflowError.
• If Java Virtual Machine stacks can be dynamically expanded, and expansion is attempted but insufficient memory can be made available to effect the expansion, or if insufficient memory can be made available to create the initial Java Virtual Machine stack for a new thread, the Java Virtual Machine throws an OutOfMemoryError.
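The control over stack size mentioned above is exposed in the Java API through the Thread constructor that takes a stackSize argument. The following sketch uses hypothetical names and compares the recursion depth reached with two requested sizes. The API documentation describes stackSize as a suggestion that a VM is free to round or ignore, so the two numbers are not guaranteed to differ.

```java
class ThreadStackSizeDemo {
    private static void recurse(int[] depth) {
        depth[0]++;
        recurse(depth);
    }

    /** Runs unbounded recursion on a thread created with the requested stack size (a hint). */
    static int depthWithStack(long stackSizeBytes) {
        int[] depth = new int[1];
        Thread t = new Thread(null, () -> {
            try {
                recurse(depth);
            } catch (StackOverflowError e) {
                // This thread's stack is exhausted; depth[0] holds the depth reached.
            }
        }, "stack-demo", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return depth[0];
    }

    public static void main(String[] args) {
        System.out.println("256 KiB requested: depth " + depthWithStack(256L * 1024));
        System.out.println("2 MiB requested:   depth " + depthWithStack(2L * 1024 * 1024));
    }
}
```

On HotSpot the default size for all threads is typically set with the -Xss option instead.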
2.5.3 Heap
The Java Virtual Machine has a heap that is shared among all Java Virtual Machine threads. The heap is the run-time data area from which memory for all class instances and arrays is allocated. The heap is created on virtual machine start-up. Heap storage for objects is reclaimed by an automatic storage management system (known as a garbage collector); objects are never explicitly deallocated. The Java Virtual Machine assumes no particular type of automatic storage management system, and the storage management technique may be chosen according to the implementor's system requirements. The heap may be of a fixed size or may be expanded as required by the computation and may be contracted if a larger heap becomes unnecessary. The memory for the heap does not need to be contiguous.
A Java Virtual Machine implementation may provide the programmer or the user control over the initial size of the heap, as well as, if the heap can be dynamically expanded or contracted, control over the maximum and minimum heap size.
The following exceptional condition is associated with the heap:
• If a computation requires more heap than can be made available by the automatic storage management system, the Java Virtual Machine throws an OutOfMemoryError.
2.5.4 Method Area
The Java Virtual Machine has a method area that is shared among all Java Virtual Machine threads. The method area is analogous to the storage area for compiled code of a conventional language or analogous to the “text” segment in an operating system process. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and interface initialization and in instance initialization (§2.9).
The method area is created on virtual machine start-up. Although the method area is logically part of the heap, simple implementations may choose not to either garbage collect or compact it. This specification does not mandate the location of the method area or the policies used to manage compiled code. The method area may be of a fixed size or may be expanded as required by the computation and may be contracted if a larger method area becomes unnecessary. The memory for the method area does not need to be contiguous.
A Java Virtual Machine implementation may provide the programmer or the user control over the initial size of the method area, as well as, in the case of a varying-size method area, control over the maximum and minimum method area size.
The following exceptional condition is associated with the method area:
• If memory in the method area cannot be made available to satisfy an allocation request, the Java Virtual Machine throws an OutOfMemoryError.
2.5.5 Run-Time Constant Pool
A run-time constant pool is a per-class or per-interface run-time representation of the constant_pool table in a class file (§4.4). It contains several kinds of constants, ranging from numeric literals known at compile-time to method and field references that must be resolved at run-time. The run-time constant pool serves a function similar to that of a symbol table for a conventional programming language, although it contains a wider range of data than a typical symbol table.
Each run-time constant pool is allocated from the Java Virtual Machine’s method area (§2.5.4). The run-time constant pool for a class or interface is constructed when the class or interface is created (§5.3) by the Java Virtual Machine.
The following exceptional condition is associated with the construction of the runtime constant pool for a class or interface:
• When creating a class or interface, if the construction of the run-time constant pool requires more memory than can be made available in the method area of the Java Virtual Machine, the Java Virtual Machine throws an OutOfMemoryError.
See §5 (Loading, Linking, and Initializing) for information about the construction of the run-time constant pool.
2.5.6 Native Method Stacks
An implementation of the Java Virtual Machine may use conventional stacks, colloquially called “C stacks,” to support native methods (methods written in a language other than the Java programming language). Native method stacks may also be used by the implementation of an interpreter for the Java Virtual Machine’s instruction set in a language such as C. Java Virtual Machine implementations that cannot load native methods and that do not themselves rely on conventional stacks need not supply native method stacks. If supplied, native method stacks are typically allocated per thread when each thread is created.
This specification permits native method stacks either to be of a fixed size or to dynamically expand and contract as required by the computation. If the native method stacks are of a fixed size, the size of each native method stack may be chosen independently when that stack is created.
A Java Virtual Machine implementation may provide the programmer or the user control over the initial size of the native method stacks, as well as, in the case of varying-size native method stacks, control over the maximum and minimum method stack sizes.
The following exceptional conditions are associated with native method stacks:
• If the computation in a thread requires a larger native method stack than is permitted, the Java Virtual Machine throws a StackOverflowError.
• If native method stacks can be dynamically expanded and native method stack expansion is attempted but insufficient memory can be made available, or if insufficient memory can be made available to create the initial native method stack for a new thread, the Java Virtual Machine throws an OutOfMemoryError.
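Native methods are marked with the native modifier, which can be checked through reflection. The sketch below uses an illustrative helper name; the claim that System.currentTimeMillis() is declared native holds for OpenJDK-based JDKs and is an assumption about the class library, not something the specification requires.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodDemo {
    /** Reports whether the named method is declared with the native modifier. */
    static boolean isNative(Class<?> type, String name, Class<?>... params) {
        try {
            Method m = type.getDeclaredMethod(name, params);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Implemented outside Java in OpenJDK, so calls to it run native code.
        System.out.println("System.currentTimeMillis native? "
                + isNative(System.class, "currentTimeMillis"));
        // An ordinary Java method, executed from bytecode on the JVM stack.
        System.out.println("String.length native? " + isNative(String.class, "length"));
    }
}
```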
To sum up: according to the JVM specification (JVMS), the memory that Java uses at run time, including the memory for object allocation, is divided into the data areas described above.
It should be noted that the method area is a logical concept. In the HotSpot JVM it was implemented as the permanent generation (PermGen, "Perm Space") before version 1.8 and as Metaspace from 1.8 onward.
The main differences between the two are:
1. PermGen held string constants (until JDK 7 moved the string pool to the heap); Metaspace does not store string constants.
2. PermGen was collected only as part of a full GC (FGC); Metaspace memory is released when the class loaders that own the metadata are unloaded.
3. The maximum size of PermGen is fixed at start-up (-XX:MaxPermSize) and cannot grow beyond it; if no limit is set for Metaspace (-XX:MaxMetaspaceSize), it can grow until native memory is exhausted.
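Which memory pools a given JVM actually uses can be listed through the java.lang.management API. This is a minimal sketch with an illustrative class name. Pool names are implementation-specific: on HotSpot 8 and later a non-heap pool named Metaspace is typically present, but the code does not depend on that.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {
    /** Returns the names of all memory pools of the given type (HEAP or NON_HEAP). */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```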
[Figure: summary drawing of the run-time data areas, for reference only]
JVM Run-Time Data Areas
The following are my notes from reading the JVM specification.
1. Data Areas for Each Individual Thread (not shared)
Data Areas for each individual thread include program counter register, JVM Stack, and Native Method Stack. They are all created when a new thread is created.
The Program Counter Register tracks the instruction that each thread is currently executing.
The JVM Stack contains frames, as shown in the diagram below.
The Native Method Stack is used to support native methods, i.e., methods written in a language other than Java.
2. Data Areas Shared by All Threads
All threads share Heap and Method Area.
The Heap is the area that we deal with most frequently. It stores arrays and objects and is created when the JVM starts up. Garbage collection works in this area.
The Method Area stores the run-time constant pool, field and method data, and the code of methods and constructors.
Runtime Constant Pool is a per-class or per-interface run-time representation of the constant_pool table in a class file. It contains several kinds of constants, ranging from numeric literals known at compile-time to method and field references that must be resolved at run-time.
The Stack contains frames, and a frame is pushed onto the stack when a method is invoked. A frame contains a local variable array, an operand stack, and a reference to the constant pool.
For more information, please go to the official JVM specification site.
JVM Runtime Data Area
While executing a Java program, the Java virtual machine divides the memory it manages into several different data areas. Each of these areas has its own purpose and its own creation and destruction time. Some areas exist for as long as the virtual machine process runs, while others are created and destroyed as user threads start and terminate.
1. Program counter (key point)
The Program Counter Register is a small memory area that can be seen as a line-number indicator for the bytecode executed by the current thread.
When the bytecode interpreter works, it selects the next bytecode instruction to execute by changing the value of this counter. It is the indicator of program control flow: basic functions such as branching, looping, jumping, exception handling, and thread resumption all rely on this counter.
Multithreading is implemented by the CPU switching between threads. To ensure that a thread resumes at the correct position after a switch, each thread needs its own program counter; the counters of different threads do not affect one another and are stored independently. We call this kind of memory "thread-private memory".
The operation of the virtual machine is similar to this loop:
Take the position in the PC and find the instruction at that position;
Execute the instruction;
Advance the PC to the next instruction.
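The loop above can be sketched as a toy interpreter. Everything in it is hypothetical: the opcodes PUSH, ADD, GOTO and HALT and their encoding are invented for illustration and are not real JVM bytecodes. The point is only to show how a pc variable selects the next instruction and how a jump is nothing more than an assignment to it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyInterpreter {
    // Made-up opcodes for this sketch; PUSH and GOTO are followed by one operand.
    static final int PUSH = 0, ADD = 1, GOTO = 2, HALT = 3;

    /** Runs the program and returns the value on top of the operand stack at HALT. */
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // position of the instruction to execute next
        while (true) {
            int opcode = code[pc]; // fetch the instruction at pc
            switch (opcode) {      // execute it, then update pc
                case PUSH:
                    stack.push(code[pc + 1]);
                    pc += 2;
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    pc += 1;
                    break;
                case GOTO:
                    pc = code[pc + 1]; // a jump simply overwrites pc
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // index:     0     1  2     3  4     5    6     7  8    9
        int[] code = {PUSH, 3, GOTO, 6, PUSH, 100, PUSH, 4, ADD, HALT};
        System.out.println(run(code)); // 7: the PUSH 100 at index 4 is skipped by the jump
    }
}
```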
2. Java virtual machine stack (JVM Stack) (key point)
1. The stack-based execution process of the program
The Java virtual machine stack is thread-private, and its life cycle is the same as the thread's.
The virtual machine stack describes the thread memory model of Java method execution: each time a method is executed, the Java virtual machine creates a stack frame that stores the local variable table, the operand stack, the dynamic link, the method exit (return) information, and so on.
Each thread has one JVM stack, and each method invocation in the thread has one stack frame.
The process from a method being called to its completion corresponds to its stack frame being pushed onto and then popped off the virtual machine stack.
Running a program is really the continuous execution of methods, and executing and invoking methods is really the continuous pushing and popping of stack frames.
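One way to watch frames being pushed is to measure the length of the stack trace at different call depths. This is a rough sketch with illustrative names: a stack trace is a view of the frames rather than the frames themselves, but each nested Java call adds one entry to it.

```java
class FrameDepthDemo {
    /** Number of frames on the current thread's stack, as seen through a stack trace. */
    static int currentDepth() {
        return new Throwable().getStackTrace().length;
    }

    static int outer() {
        return currentDepth();
    }

    static int outerViaInner() {
        return inner(); // one extra method call means one extra frame
    }

    private static int inner() {
        return currentDepth();
    }

    public static void main(String[] args) {
        int direct = outer();
        int nested = outerViaInner();
        System.out.println("depth via outer():         " + direct);
        System.out.println("depth via outerViaInner(): " + nested);
        System.out.println("difference: " + (nested - direct)); // 1
    }
}
```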
2. Stack frame
The JVM uses the method as its most basic unit of execution, and the "stack frame" is the data structure that supports method invocation and method execution in the virtual machine. It is also the element type of the virtual machine stack in the run-time data area. In short, each method invocation corresponds to one stack frame.
A stack frame can be divided into the following four parts:
1. Local Variable Table (important)
The local variable table plays a role comparable to registers.
When we talk about "heap" and "stack", "stack" usually refers to the virtual machine stack, or more often to the local variable table part of the virtual machine stack.
A single slot of the local variable table can hold a value of one of 8 types: boolean, byte, char, short, int, float, reference (an object reference), and returnAddress (which points to the location of a bytecode instruction). A long or double occupies two consecutive slots.
The first 6 kinds are easy to understand; here we need to focus on the seventh kind, the reference type.
The reference type represents a reference to an object instance, and the JVM can do two things through this reference:
1. Directly or indirectly find the starting address or index of the object's data in the heap.
2. Directly or indirectly find the type information for the object's data type, which is stored in the method area.
Summary: the local variable table is storage for a set of variable values; it holds the method parameters and the local variables defined inside the method. When a method is called, the Java virtual machine uses the local variable table to pass the argument values to the parameter list, that is, to transfer the actual parameters to the formal parameters.
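Because arguments are copied into the callee's own local variables, reassigning a parameter never affects the caller, while mutating the object a reference points to does. A minimal sketch with illustrative method names:

```java
class ParameterPassingDemo {
    // Only the callee's local copy of the reference is redirected.
    static void reassign(int[] arr) {
        arr = new int[] {99};
    }

    // The copied reference still points to the caller's array on the heap.
    static void mutate(int[] arr) {
        arr[0] = 99;
    }

    // Primitive values are copied as well; the caller's variable is untouched.
    static void bump(int n) {
        n++;
    }

    public static void main(String[] args) {
        int[] data = {1};
        reassign(data);
        System.out.println(data[0]); // 1
        mutate(data);
        System.out.println(data[0]); // 99

        int x = 5;
        bump(x);
        System.out.println(x);       // 5
    }
}
```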
2. Operand Stack
When a method starts to execute, its operand stack is empty. During execution, various bytecode instructions write to and read from the operand stack, that is, they perform push and pop operations.
For example, take the bytecode instruction iadd, which adds integers. When it runs, the two elements nearest the top of the operand stack must already hold two int values. When the instruction executes, the two int values are popped and added, and the sum is pushed back onto the stack.
The types of the elements on the operand stack must strictly match the sequence of bytecode instructions. The compiler must guarantee this when compiling the program, and it is checked again during the verification stage of class loading.
For example, with the iadd instruction above, if the top two elements of the operand stack are not both of type int (say one of them is a long), they cannot be added and verification fails with an error.
Summary: the interpreting execution engine of the Java virtual machine is called a "stack-based execution engine", and the "stack" in that name is the operand stack.
3. Dynamic Linking
Each stack frame contains a reference into the run-time constant pool for the method to which the frame belongs. This reference is held to support dynamic linking during method invocation.
We already know that the constant pool of a Class file contains a large number of symbolic references, and that a method invocation instruction in the bytecode takes a symbolic reference to the method in the constant pool as its operand. Some of these symbolic references are converted to direct references during the class loading stage or when they are first used; this conversion is called static resolution. The rest are converted to direct references during each run; this part is called dynamic linking.
Reference: Chapter 8 of "In-depth Understanding of the Java Virtual Machine"
4. Return address
Once a method has started to execute, there are only two ways for it to exit:
1. Normal completion
The method ends normally: the execution engine encounters a bytecode instruction for method return. Whether a value is returned depends on the specific method.
2. Abrupt completion
An exception is raised during execution and is not handled inside the method body, which causes the method to exit.
Whichever way the method exits, execution must return to the position from which the method was called before the program can continue. When a method returns, some information may need to be saved in the stack frame to help restore the execution state of its caller.
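The two ways of exiting a method can be seen from the caller's side. In this small sketch (the class and method names are illustrative), parse either returns normally or completes abruptly with a NumberFormatException, and in both cases control comes back to call.

```java
class CompletionDemo {
    // Completes normally for "42", abruptly (with an exception) for "x".
    static int parse(String s) {
        return Integer.parseInt(s);
    }

    static String call(String s) {
        try {
            int value = parse(s);            // normal completion: a value is returned
            return "normal:" + value;
        } catch (NumberFormatException e) {  // abrupt completion: the exception reaches the caller
            return "abrupt:" + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(call("42")); // normal:42
        System.out.println(call("x"));  // abrupt:NumberFormatException
    }
}
```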
Each method invocation corresponds to a stack frame: starting the method pushes the frame, and exiting the method pops it.
While a method executes, its parameters and variables are stored in the local variable table, and the operands of the bytecode being executed are held on the operand stack.
3. Native method stack
The virtual machine stack serves the virtual machine when it executes Java methods (that is, bytecode), while the native method stack serves the native methods used by the virtual machine.
Because it is managed automatically by the JVM, it is difficult for us to intervene manually.
4. Java heap (Heap) (key point)
For Java applications, the Java heap is the largest piece of memory managed by the virtual machine. It is a memory area shared by all threads and is created when the virtual machine starts. The sole purpose of this memory area is to store object instances, and "almost" all object instances in Java are allocated here.
The Java heap is the memory area managed by the garbage collector (GC), so it is also called the "GC heap". From the perspective of reclaiming memory, because the garbage collectors of the industry-mainstream HotSpot virtual machine have traditionally been based on generational collection theory, the Java heap is often described with terms such as "new generation", "old generation", and "Eden area".
Since the heap is shared by all threads, under high concurrency several threads may want to use the same memory block at the same time, which causes contention and reduces efficiency. So, from the perspective of memory allocation, the heap shared by all threads (more precisely, the Eden area of the heap) can be divided into thread-private allocation buffers (Thread Local Allocation Buffer, TLAB for short), which reduce contention and make object allocation faster.
However the heap is divided, this does not change what it stores: whichever region is involved, only object instances are stored there. The purpose of subdividing the Java heap is only to let the GC reclaim memory better and to allocate memory faster.
Allocation order: try to allocate on the stack -> TLAB allocation -> allocation on the heap (new generation, old generation)
5. Method area (key point)
1. Concept of the method area
Like the Java heap, the method area is shared by all threads. It stores data such as the type information loaded by the virtual machine, constants, static variables, and the code cache produced by just-in-time compilation.
The method area is a logical concept. Over time, it has had two specific implementations:
1. Before JDK 1.8, the method area was implemented as Perm Space, the permanent generation.
2. From JDK 1.8 onward, the method area is implemented as Meta Space, the metadata area.
Comparing the two is a question that an interviewer might ask.
2. Runtime constant pool
The run-time constant pool is part of the method area.
As we learned earlier, the Class file structure includes a constant pool table, which stores the literals and symbolic references generated by the compiler (a static concept). At run time (a dynamic concept), once the class has been loaded, this content is stored in the run-time constant pool in the method area.
In addition to the symbolic references described in the Class file, the run-time constant pool also stores the direct references translated from those symbolic references.
7. Direct memory
Direct memory is not part of the virtual machine's run-time data areas, but it is still memory used through the JVM. It is used frequently and can also lead to an OutOfMemoryError (OOM). Because it is closely related to the run-time data areas, it is explained here.
To improve I/O efficiency, direct memory was added in JDK 1.4: memory outside the Java heap, obtained from the operating system, can be accessed directly from within the JVM.
JDK 1.4 introduced the NIO classes, an I/O approach based on channels (Channel) and buffers (Buffer). It can use native functions to allocate off-heap memory directly and then operate on that memory through a DirectByteBuffer object stored in the Java heap, which acts as a reference to it. This avoids copying data back and forth between the Java heap and the native heap (so-called zero copy).
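A direct buffer is obtained with ByteBuffer.allocateDirect, while ByteBuffer.allocate returns a buffer backed by an array on the Java heap. The sketch below uses an illustrative class name and only shows that both kinds are used through the same API. On HotSpot the total amount of direct memory can typically be capped with the -XX:MaxDirectMemorySize option.

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    /** Writes an int into a 16-byte buffer (direct or heap) and prepares it for reading. */
    static ByteBuffer roundTrip(boolean direct) {
        ByteBuffer buf = direct
                ? ByteBuffer.allocateDirect(16) // off-heap memory, referenced by a small heap object
                : ByteBuffer.allocate(16);      // backed by a byte[] on the Java heap
        buf.putInt(42);
        buf.flip(); // switch from writing to reading
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer direct = roundTrip(true);
        ByteBuffer heap = roundTrip(false);
        System.out.println("direct? " + direct.isDirect() + ", value " + direct.getInt()); // direct? true, value 42
        System.out.println("direct? " + heap.isDirect() + ", value " + heap.getInt());     // direct? false, value 42
    }
}
```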
Supplement: common instructions
bipush: sign-extends a byte value to an int and pushes it onto the operand stack (b(byte) i(int) push).
istore_n: pops the top element of the operand stack and stores it in the local variable at index n of the local variable table.
iload_n: pushes the value of the local variable at index n of the local variable table onto the top of the operand stack.
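To connect these instructions to source code, here is a small static method with the bytecode that javac typically emits for it shown in comments. The class name is illustrative, and the listing assumes a static method without parameters, so local variable slots start at 0; the actual output can be checked with javap -c.

```java
class BytecodeDemo {
    static int add() {
        int a = 10;    // bipush 10 ; istore_0
        int b = 20;    // bipush 20 ; istore_1
        int c = a + b; // iload_0 ; iload_1 ; iadd ; istore_2
        return c;      // iload_2 ; ireturn
    }

    public static void main(String[] args) {
        System.out.println(add()); // 30
    }
}
```

Each statement first loads its inputs onto the operand stack, operates on them there, and then stores the result back into a slot of the local variable table.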
For details of each bytecode instruction, refer to the official documentation.
For the detailed execution of a program translated into bytecode instructions, see "In-depth Understanding of the Java Virtual Machine", p. 330.