JVM: The Constant Pool

It’s snowy, windy and dark outside. Perfect weather to read a technical spec, isn’t it?
Let’s read about the .class file’s constant pool today.
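Section 4.4 of the JVM spec lists and describes all the structures that form the constant pool. As you might remember from JVM: Class File Format - Structure, the constant pool data are placed right after the .class file’s version information: the magic number, minor_version and major_version come first, followed by constant_pool_count and the constant_pool table itself.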
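To make that layout concrete, here is a minimal sketch that reads the fixed-size header up to constant_pool_count. The class name ClassFileHeader and its fields are my own illustration, not anything from the spec or a library; it assumes the whole class file has already been loaded into a byte array.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: reads magic, version and constant_pool_count.
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minor, int major, int count) {
        this.minorVersion = minor;
        this.majorVersion = major;
        this.constantPoolCount = count;
    }

    static ClassFileHeader parse(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF; // u2 minor_version
        int major = buf.getShort() & 0xFFFF; // u2 major_version
        int count = buf.getShort() & 0xFFFF; // u2 constant_pool_count
        return new ClassFileHeader(minor, major, count);
    }
}
```

The constant pool entries start at the byte right after these ten bytes.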
What’s important:
- JVM instructions don’t know the real run-time layout (i.e. the location in the process’ memory) of classes, interfaces or arrays
- each instruction relies only on symbolic information from the constant pool
The Format
Each entry in the constant pool table has the same general format (cp_info in the spec): a one-byte tag followed by an info array.
The tag determines what kind of constant it is and must be followed by two or more bytes that give information about the specific constant. The contents and length of the info array depend on the tag value.
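Because the length of info depends on the tag, a reader has to know each tag's size just to step over an entry. The sketch below shows one way to do that; the class CpEntrySkipper and its skip method are hypothetical names of mine, and the sizes are taken from the structures in section 4.4 of the spec.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: steps over a single cp_info entry.
final class CpEntrySkipper {
    /** Advances buf past one entry and returns the number of pool slots it occupies. */
    static int skip(ByteBuffer buf) {
        int tag = buf.get() & 0xFF;
        int infoLength;
        switch (tag) {
            case 1: // Utf8: u2 length, then that many bytes
                infoLength = buf.getShort() & 0xFFFF;
                break;
            case 3: case 4: // Integer, Float
                infoLength = 4;
                break;
            case 5: case 6: // Long, Double
                infoLength = 8;
                break;
            case 7: case 8: case 16: case 19: case 20: // Class, String, MethodType, Module, Package
                infoLength = 2;
                break;
            case 9: case 10: case 11: case 12: case 17: case 18: // refs, NameAndType, (Invoke)Dynamic
                infoLength = 4;
                break;
            case 15: // MethodHandle: u1 reference_kind + u2 reference_index
                infoLength = 3;
                break;
            default:
                throw new IllegalArgumentException("unknown constant pool tag: " + tag);
        }
        buf.position(buf.position() + infoLength);
        // Long and Double take up two slots in the table.
        return (tag == 5 || tag == 6) ? 2 : 1;
    }
}
```

Calling skip in a loop, and adding its return value to the current index until the index reaches constant_pool_count, walks the whole pool.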
The Constants
Here’s a list of constant kinds, their tags, and the class file format version and Java SE version which introduced each of them.
The “L?” column shows whether constants of this type are “loadable” - some constants may be pushed onto the stack at run time for further processing.
Constant Kind | tag | class ver | Java SE | L? |
---|---|---|---|---|
CONSTANT_Utf8 | 1 | 45.3 | 1.0.2 | |
CONSTANT_Integer | 3 | 45.3 | 1.0.2 | ✓ |
CONSTANT_Float | 4 | 45.3 | 1.0.2 | ✓ |
CONSTANT_Long | 5 | 45.3 | 1.0.2 | ✓ |
CONSTANT_Double | 6 | 45.3 | 1.0.2 | ✓ |
CONSTANT_Class | 7 | 45.3 | 1.0.2 | ✓ |
CONSTANT_String | 8 | 45.3 | 1.0.2 | ✓ |
CONSTANT_Fieldref | 9 | 45.3 | 1.0.2 | |
CONSTANT_Methodref | 10 | 45.3 | 1.0.2 | |
CONSTANT_InterfaceMethodref | 11 | 45.3 | 1.0.2 | |
CONSTANT_NameAndType | 12 | 45.3 | 1.0.2 | |
CONSTANT_MethodHandle | 15 | 51.0 | 7 | ✓ |
CONSTANT_MethodType | 16 | 51.0 | 7 | ✓ |
CONSTANT_Dynamic | 17 | 55.0 | 11 | ✓ |
CONSTANT_InvokeDynamic | 18 | 51.0 | 7 | |
CONSTANT_Module | 19 | 53.0 | 9 | |
CONSTANT_Package | 20 | 53.0 | 9 | |
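The table translates naturally into code. Here is a small sketch of it as an enum; ConstantKind and fromTag are names I made up for illustration, and only the tag and the “loadable” column are modelled.

```java
// Hypothetical enum mirroring the table: tag value and the "L?" (loadable) column.
enum ConstantKind {
    UTF8(1, false), INTEGER(3, true), FLOAT(4, true), LONG(5, true), DOUBLE(6, true),
    CLASS(7, true), STRING(8, true), FIELDREF(9, false), METHODREF(10, false),
    INTERFACE_METHODREF(11, false), NAME_AND_TYPE(12, false),
    METHOD_HANDLE(15, true), METHOD_TYPE(16, true), DYNAMIC(17, true),
    INVOKE_DYNAMIC(18, false), MODULE(19, false), PACKAGE(20, false);

    final int tag;
    final boolean loadable;

    ConstantKind(int tag, boolean loadable) {
        this.tag = tag;
        this.loadable = loadable;
    }

    static ConstantKind fromTag(int tag) {
        for (ConstantKind kind : values()) {
            if (kind.tag == tag) {
                return kind;
            }
        }
        // Tags 2, 13 and 14 are not in the table, so they end up here too.
        throw new IllegalArgumentException("unknown constant pool tag: " + tag);
    }
}
```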
The Structures
The JVM Specification describes the structures containing constant data in the form of pseudocode structs.
- Same-named items have similar semantics in every structure (e.g. class_index always means an index into the constant pool table)
- but they might differ in what data such an item “accepts” as valid (e.g. the class_index item in CONSTANT_Methodref_info must be a valid index into the constant_pool table, and the entry at that index must be a CONSTANT_Class_info structure).
For the detailed description of each structure please see the JVM spec, specifically the section The Constant Pool.
Example
Knowing the constant_pool layout and the tag values lets me identify the constants, at least in simple class files.
Let’s have a look at a .class file compiled from Constant.java: its raw contents as shown by hexedit, and the constants reported by javap.
So the size of the pool (constant_pool_count) is 0x12 (18), which means there are 17 actual entries: the table is indexed from 1, so there is one entry fewer than the count.
The first constant pool entry has the tag 0x0A (10), which is CONSTANT_Methodref, and this tag is followed by two two-byte items:
- class_index of value 0x0002
- name_and_type_index of value 0x0003
So, to know what class and method this entry refers to, we need to parse the second and third entries.
The second constant pool entry has the tag 0x07 (7), which is CONSTANT_Class - as expected - and the next two bytes (0x0004) are the index into the constant pool where the CONSTANT_Utf8_info with the class name sits.
The third has the tag 0x0C (12), denoting CONSTANT_NameAndType, and the following bytes give name_index (0x0005) and descriptor_index (0x0006).
The fourth has the tag 0x01 (UTF-8 string), its length is 0x0010 (16), and the next 16 bytes are the “meat” of the string: java/lang/Object.
The fifth is a UTF-8 string (tag: 0x01) of length 6 and value <init>.
The sixth is a UTF-8 string (tag: 0x01) of length 3 and value ()V.
So, what we have decoded so far is the reference to the java.lang.Object constructor.
The seventh (tag 0x09) is a CONSTANT_Fieldref, so what follows is class_index (0x08) and name_and_type_index (0x09).
The eighth (tag: 0x07) is a CONSTANT_Class which points to the 10th entry of the constant pool.
… and so on. Continuing step by step would allow us to find out what all the constants are, with lots of indirection, and finally get the whole list of 17 constants (nicely laid out with indices by javap -v).
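The manual decoding of the first six entries can be sketched in code. Everything here is illustrative: PoolWalk, firstSixEntries and describeMethodref are names I invented, the byte sequence is rebuilt from the values quoted in the walk-through rather than read from the real file, and only the four tags seen so far (1, 7, 10, 12) are handled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical mini-parser for the entries decoded by hand above.
final class PoolWalk {
    /** Rebuilds the bytes of entries #1..#6 from the values in the walk-through. */
    static byte[] firstSixEntries() {
        ByteBuffer b = ByteBuffer.allocate(64);
        b.put((byte) 10).putShort((short) 2).putShort((short) 3); // #1 Methodref
        b.put((byte) 7).putShort((short) 4);                      // #2 Class
        b.put((byte) 12).putShort((short) 5).putShort((short) 6); // #3 NameAndType
        for (String s : new String[] {"java/lang/Object", "<init>", "()V"}) {
            byte[] utf = s.getBytes(StandardCharsets.US_ASCII);   // #4..#6 Utf8
            b.put((byte) 1).putShort((short) utf.length).put(utf);
        }
        return Arrays.copyOf(b.array(), b.position());
    }

    /** Parses 'entries' entries; Utf8 becomes a String, everything else an int[] of indices. */
    static Object[] parse(byte[] data, int entries) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        Object[] pool = new Object[entries + 1]; // index 0 is unused, as in the class file
        for (int i = 1; i <= entries; i++) {
            int tag = buf.get() & 0xFF;
            switch (tag) {
                case 1: {
                    byte[] bytes = new byte[buf.getShort() & 0xFFFF];
                    buf.get(bytes);
                    // Simplification: plain UTF-8 decoding is only right for ASCII content.
                    pool[i] = new String(bytes, StandardCharsets.UTF_8);
                    break;
                }
                case 7:
                    pool[i] = new int[] {buf.getShort() & 0xFFFF};
                    break;
                case 10:
                case 12:
                    pool[i] = new int[] {buf.getShort() & 0xFFFF, buf.getShort() & 0xFFFF};
                    break;
                default:
                    throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
            }
        }
        return pool;
    }

    /** Follows Methodref -> Class -> Utf8 and Methodref -> NameAndType -> Utf8. */
    static String describeMethodref(Object[] pool, int index) {
        int[] ref = (int[]) pool[index];
        int[] cls = (int[]) pool[ref[0]];
        int[] nameAndType = (int[]) pool[ref[1]];
        return pool[cls[0]] + "." + pool[nameAndType[0]] + ":" + pool[nameAndType[1]];
    }
}
```

With these bytes, PoolWalk.describeMethodref(PoolWalk.parse(PoolWalk.firstSixEntries(), 6), 1) gives java/lang/Object.<init>:()V, the same thing javap prints next to entry #1.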
My Findings
My notes - sometimes just quotes from the spec:
Int and Float
These are the CONSTANT_Integer_info and CONSTANT_Float_info structures, which share the same format: the tag followed by a four-byte item called bytes.
The values are stored in big-endian order (high byte first). For a float, those four bytes are first interpreted as an int value (called bits below) in order to calculate the actual value.
In case of float I found the value of infinity :)
- the positive infinity is 0x7f800000
- the negative infinity is 0xff800000
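And NaN is
- every value in the range 0x7f800001 through 0x7fffffff or 0xff800001 through 0xffffffff
For all other bit patterns, three values are calculated from bits: s is the sign (1 if bits >> 31 is 0, otherwise -1), e is the exponent ((bits >> 23) & 0xff), and m is the mantissa ((bits & 0x7fffff) << 1 if e is 0, otherwise (bits & 0x7fffff) | 0x800000).
The resulting float is the value of $$s \cdot m \cdot 2^{e-150}$$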
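To check my understanding of the formula, here is a sketch that decodes the bits by hand, using the s, e and m definitions from section 4.4.4 of the spec. FloatBits and decode are hypothetical names of mine; in real code Float.intBitsToFloat does all of this in a single call.

```java
// Hypothetical helper: decodes CONSTANT_Float_info bits step by step.
final class FloatBits {
    static double decode(int bits) {
        if (bits == 0x7f800000) return Double.POSITIVE_INFINITY;
        if (bits == 0xff800000) return Double.NEGATIVE_INFINITY;
        // Exponent bits all set and a non-zero mantissa: one of the NaN patterns.
        if ((bits & 0x7f800000) == 0x7f800000) return Double.NaN;

        int s = ((bits >> 31) == 0) ? 1 : -1;
        int e = (bits >> 23) & 0xff;
        int m = (e == 0)
                ? (bits & 0x7fffff) << 1         // subnormal: no implicit leading one
                : (bits & 0x7fffff) | 0x800000;  // normal: restore the implicit leading one
        // s * m * 2^(e - 150); a double holds every float value exactly.
        return s * Math.scalb((double) m, e - 150);
    }
}
```

For any finite pattern, FloatBits.decode(bits) should agree with (double) Float.intBitsToFloat(bits).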
Long and Double
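These are 8-byte numeric values which take up two entries in the constant pool table (a design decision that - as the spec admits - was a poor choice). This means that a constant pool index that would “point into” the second slot of a long/double entry is not allowed: if the entry sits at index n, the next usable entry is at index n+2.
Here the calculation of the value is quite similar to the calculation of the float value. The bits (first combined into a long value as ((long) high_bytes << 32) + low_bytes) are checked for (+ or -) infinity or NaN, and otherwise three values are calculated: s is 1 if bits >> 63 is 0 and -1 otherwise, e is (int) ((bits >> 52) & 0x7ffL), and m is (bits & 0xfffffffffffffL) << 1 if e is 0, otherwise (bits & 0xfffffffffffffL) | 0x10000000000000L.
The resulting double is the value of $$s \cdot m \cdot 2^{e-1075}$$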
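The same exercise for double, again as a sketch with made-up names (DoubleBits, combine, decode) and the s, e and m definitions from section 4.4.5 of the spec. One Java-specific detail: high_bytes and low_bytes are unsigned u4 items, so when they are held in int variables the low half has to be masked before it is added.

```java
// Hypothetical helper: decodes CONSTANT_Double_info bits step by step.
final class DoubleBits {
    /** Joins the two u4 halves; the mask stops a negative int from sign-extending. */
    static long combine(int highBytes, int lowBytes) {
        return ((long) highBytes << 32) + (lowBytes & 0xFFFFFFFFL);
    }

    static double decode(long bits) {
        if (bits == 0x7ff0000000000000L) return Double.POSITIVE_INFINITY;
        if (bits == 0xfff0000000000000L) return Double.NEGATIVE_INFINITY;
        // Exponent bits all set and a non-zero mantissa: one of the NaN patterns.
        if ((bits & 0x7ff0000000000000L) == 0x7ff0000000000000L) return Double.NaN;

        int s = ((bits >> 63) == 0) ? 1 : -1;
        int e = (int) ((bits >> 52) & 0x7ffL);
        long m = (e == 0)
                ? (bits & 0xfffffffffffffL) << 1
                : (bits & 0xfffffffffffffL) | 0x10000000000000L;
        // s * m * 2^(e - 1075)
        return s * Math.scalb((double) m, e - 1075);
    }
}
```

In practice Double.longBitsToDouble(DoubleBits.combine(high, low)) is all that is needed; the manual version is only there to mirror the formula.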
Fields, Methods and Interface Methods
Fields, methods and interface methods are represented with
- CONSTANT_Fieldref_info (tag: 9)
- CONSTANT_Methodref_info (tag: 10)
- CONSTANT_InterfaceMethodref_info (tag: 11)
which all have the same structure: the tag, followed by a two-byte class_index and a two-byte name_and_type_index.
Here class_index is the constant_pool index of a CONSTANT_Class_info and name_and_type_index is the index of a CONSTANT_NameAndType_info.
Restrictions apply:
- class_index must point to a class, and not an interface, in the case of CONSTANT_Methodref_info,
- class_index must point to an interface, and not a class, in the case of CONSTANT_InterfaceMethodref_info,
- name_and_type_index must point to a constant with a field descriptor in the case of CONSTANT_Fieldref_info,
- name_and_type_index must point to a constant with a method descriptor in the case of CONSTANT_Methodref_info or CONSTANT_InterfaceMethodref_info.
Field or method
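Those are represented by the CONSTANT_NameAndType_info structure (tag: 12), which holds an unqualified name (name_index) and a descriptor (descriptor_index) (see my previous Descriptors post), both as indices into the constant pool.
Interestingly, the structure itself carries no information about which class the field or method comes from; that is supplied by the class_index of the Fieldref, Methodref or InterfaceMethodref entry that refers to it.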
Method Type
This is represented as CONSTANT_MethodType_info (tag: 16): the tag followed by a two-byte descriptor_index.
This constant represents the type of a method and “points” to the pool entry where the appropriate method descriptor is placed.
String constant
In the CONSTANT_Utf8_info structure (the tag, a two-byte length and a bytes array) the length is given in bytes, and the bytes hold the string in modified UTF-8, which is not null-terminated:
- the null character (U+0000) is encoded using the 2-byte format, so the encoded bytes never contain an embedded zero byte
- non-null ASCII characters are represented using only 1 byte per code point
- only the 1, 2 and 3-byte formats of standard UTF-8 are used; the JVM does not recognize the 4-byte format
- the JVM uses its own two-times-three-byte format for supplementary characters (those above U+FFFF): each of the two surrogate code units is represented by three bytes.
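These rules are easy to verify from Java, because DataOutputStream.writeUTF writes exactly this encoding: a two-byte length followed by modified UTF-8 bytes. The wrapper below (ModifiedUtf8.encode is my own hypothetical helper) strips the length prefix so that only the encoded bytes remain.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper: returns the modified UTF-8 bytes of a string.
final class ModifiedUtf8 {
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeUTF(s); // u2 length, then the modified UTF-8 bytes
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for convenience.
            throw new UncheckedIOException(e);
        }
        byte[] all = out.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length); // drop the length prefix
    }
}
```

Encoding the single character U+0000 yields the two bytes 0xC0 0x80, and a supplementary character such as U+1F600 yields six bytes, where standard UTF-8 would use four.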
Method Handle
This is the CONSTANT_MethodHandle_info struct, which is a bit complex:
- next to the tag (of value 15) there is an item reference_kind with a value in the range 1 to 9
- this value represents the kind of bytecode behavior of the handle
- it also tells what should be in the constant pool under reference_index
reference_kind | entry under reference_index |
---|---|
1, 2, 3, 4 | CONSTANT_Fieldref_info for which a getter/setter handle is to be created |
5 or 8 | CONSTANT_Methodref_info for a method (5) or a constructor (8) for which a handle is to be created |
6, 7 | CONSTANT_Methodref_info, or (since class file version 52.0) CONSTANT_InterfaceMethodref_info |
9 | CONSTANT_InterfaceMethodref_info |
So this constant represents a kind of a method pointer (hence handle).
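The nine reference kinds have names, and the JDK exposes them through java.lang.invoke.MethodHandleInfo. The sketch below pairs each kind with the kind of entry the table expects; ReferenceKinds and expectedEntry are names of mine, while referenceKindToString is the real JDK method.

```java
import java.lang.invoke.MethodHandleInfo;

// Hypothetical helper: maps reference_kind to its name and expected pool entry.
final class ReferenceKinds {
    /** Name of the kind as the JDK reports it, e.g. "invokeVirtual" for 5. */
    static String name(int referenceKind) {
        return MethodHandleInfo.referenceKindToString(referenceKind);
    }

    /** Which constant pool structure reference_index should point to. */
    static String expectedEntry(int referenceKind) {
        switch (referenceKind) {
            case 1: case 2: case 3: case 4: // getField, getStatic, putField, putStatic
                return "CONSTANT_Fieldref_info";
            case 5: case 8:                 // invokeVirtual, newInvokeSpecial
                return "CONSTANT_Methodref_info";
            case 6: case 7:                 // invokeStatic, invokeSpecial
                return "CONSTANT_Methodref_info or CONSTANT_InterfaceMethodref_info";
            case 9:                         // invokeInterface
                return "CONSTANT_InterfaceMethodref_info";
            default:
                throw new IllegalArgumentException("reference_kind must be 1..9: " + referenceKind);
        }
    }
}
```

The numeric values match the constants MethodHandleInfo.REF_getField through MethodHandleInfo.REF_invokeInterface.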
Dynamic
Most structures in the constant_pool represent “entities” (methods, fields or constants) directly. But there is also a way to generate such a representation dynamically. This is accomplished by two constants:
- CONSTANT_Dynamic_info - represents a dynamically computed constant, produced by a bootstrap method invoked while processing an ldc instruction
- CONSTANT_InvokeDynamic_info - represents a call site (java.lang.invoke.CallSite), produced by a bootstrap method invoked while processing an invokedynamic instruction
These constants represent an indirect way of getting direct information about “entities”. Interesting. When is this needed? How is this used? How does invokedynamic work? A good topic for a blog post, I guess.
Package and Module
The last two constants, CONSTANT_Module_info and CONSTANT_Package_info, exist only in class files that declare a module (i.e. a compiled module descriptor); they represent a module and a package exported or opened by a module, respectively.
Summary
This was a short and easy read through the types of constants in a constant_pool.
Hopefully I won’t need to bit-fiddle in order to read a float or put together bit patterns to recognize UTF-8 strings in the rest of my career (I would probably use DataInputStream or the ASM library for reading a .class file), but who knows?
The next sections of the .class file spec cover fields and methods, and then attributes. After that, three sections are left to complete chapter 4:
- format checking
- constraints
- verification
And then I’ll be ready to start the most mysterious part of the spec: Loading, Linking and Initializing.
See you next time!
Interesting resources
Some links I found recently:
- JRebel’s Java bytecode - using objects and calling methods
- JRebel’s Java bytecode tutorial, which led me to …
- the ASM library and its documentation
This post is part of the jvm series.
- 2022-14-02 - JVM: Loading, Linking, Initializing
- 2022-11-02 - JVM: Verification and Checks
- 2022-06-02 - JVM: Fields, Methods, Attributes
- 2022-02-02 - JVM: The Constant Pool
- 2022-26-01 - JVM: Names and Descriptors
- 2022-22-01 - JVM: Class File Format - Structure
- 2022-16-01 - JVM: Compiling for the JVM
- 2022-08-01 - JVM: Instruction Set Summary
- 2022-07-01 - JVM: Structure of the JVM
- 2022-04-01 - JVM: Introduction