JVM: Class File Format - Structure
Class File
Each .class file contains the definition of a single:
- class,
- interface, or
- module
The class format (the specific order and meaning of bytes) described in the spec does not necessarily apply to an actual file on the filesystem: those “bytes” might have been read from a database, downloaded from the web or constructed by a class loader. Keep this in mind whenever you encounter the phrase “class file format”.
A class file is:
- a stream of 8-bit bytes
- 16-bit and 32-bit values are constructed by reading two or four consecutive bytes in big-endian order (high-order byte first)
- the spec describes the format using pseudo-structures (C-like notation)
- the contents of those structures are called items
- items are stored sequentially without padding
- some structures contain tables with variable-sized items (a table index cannot be translated directly into a byte offset)
- arrays are structures with fixed-size, contiguous items and can be indexed like an array
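To make the byte-level rules concrete, here is a minimal sketch of how two- and four-byte items can be assembled from consecutive bytes. The class and method names (BigEndianItems, u2, u4) are my own and only borrow the spec's u2/u4 naming; the sample bytes are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper, written only to illustrate the byte order.
final class BigEndianItems {

    // u2: two consecutive bytes, high-order byte first.
    static int u2(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 8) | (bytes[offset + 1] & 0xFF);
    }

    // u4: four consecutive bytes, high-order byte first; a long keeps the value unsigned.
    static long u4(byte[] bytes, int offset) {
        return ((long) u2(bytes, offset) << 16) | u2(bytes, offset + 2);
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x3D};
        System.out.printf("u4 at offset 0: 0x%08X%n", u4(sample, 0));
        System.out.println("u2 at offset 6: " + u2(sample, 6));

        // DataInputStream reads multi-byte values in the same (big-endian) order.
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(sample))) {
            System.out.printf("readInt: 0x%08X%n", in.readInt());
            System.out.println("readUnsignedShort: " + in.readUnsignedShort());
        }
    }
}
```

In practice java.io.DataInputStream or java.nio.ByteBuffer already do this, so the manual shifting is shown only to spell out what "two or four consecutive bytes" means.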
The Top Level Structure
Here’s the structure in pseudo-code:
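    ClassFile {
        u4             magic;
        u2             minor_version;
        u2             major_version;
        u2             constant_pool_count;
        cp_info        constant_pool[constant_pool_count-1];
        u2             access_flags;
        u2             this_class;
        u2             super_class;
        u2             interfaces_count;
        u2             interfaces[interfaces_count];
        u2             fields_count;
        field_info     fields[fields_count];
        u2             methods_count;
        method_info    methods[methods_count];
        u2             attributes_count;
        attribute_info attributes[attributes_count];
    }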
Magic
The first four bytes identify the class file format and are always 0xCAFEBABE.
Let’s write CheckClass.java, which prints the first four bytes of the .class file given as a command-line parameter (beware: no error checking).
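A minimal sketch of such a program might look like the one below. Only the names CheckClass and getClassFileInputStream come from this post; the method body, the toHex helper and the usage message are assumptions.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// A sketch only: the body of getClassFileInputStream and the toHex helper are assumptions.
class CheckClass {

    static InputStream getClassFileInputStream(String path) throws IOException {
        return new FileInputStream(path);
    }

    // Formats bytes as upper-case hex, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CheckClass <file.class>");
            return;
        }
        try (InputStream in = getClassFileInputStream(args[0])) {
            byte[] magic = in.readNBytes(4); // the first four bytes of the file
            System.out.println(toHex(magic));
        }
    }
}
```

Run against any valid class file, for example its own (java CheckClass CheckClass.class), it should print CAFEBABE.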
Versions
Items minor_version and major_version denote the version of the class file format and are written in the form M.m, where M is major_version and m is minor_version.
- major_version: identifies the Java release this class was built for (by default, the release of the JDK that compiled it); a JRE of that release accepts (i.e. runs) all class files with a major_version value smaller than or equal to this one
- minor_version: valid values are:
  - for major_version 45..55 inclusive: any value
  - for major_version 56 and above (Java 12 or higher): 0 if no preview features are used, or 65535 if preview features are used. Here “are used” means that the class file depends on specific preview features.
Preview features, available since Java 12, let testers and early adopters see how specific language features behave in real life and encourage them to give feedback. They can be used only if specifically enabled in the JVM (--enable-preview).
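The version rules above can be sketched in a few lines of Java. The class and method names are hypothetical, and the mapping "Java release = major_version - 44" is my own shorthand (it gives 8 for 52, 12 for 56 and 17 for 61; for 45..48 it yields the x of Java 1.x).

```java
// Hypothetical helper; the names are not taken from the JVM spec.
final class ClassFileVersion {
    static final int PREVIEW_MINOR = 65535;

    // Shorthand mapping: 52 -> Java 8, 56 -> Java 12, 61 -> Java 17.
    static int javaRelease(int major) {
        return major - 44;
    }

    // Only class files with major_version >= 56 can depend on preview features.
    static boolean usesPreview(int major, int minor) {
        return major >= 56 && minor == PREVIEW_MINOR;
    }

    // For 45..55 any minor_version is valid; from 56 on only 0 and 65535 are.
    static boolean isValidMinor(int major, int minor) {
        if (major < 56) {
            return true;
        }
        return minor == 0 || minor == PREVIEW_MINOR;
    }

    static String describe(int major, int minor) {
        String preview = usesPreview(major, minor) ? ", depends on preview features" : "";
        return major + "." + minor + " (Java " + javaRelease(major) + preview + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(61, 0));
        System.out.println(describe(61, 65535));
    }
}
```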
Preview features handling
An interesting feature of the JVM spec is that it defines the proper behavior of a JVM implementation when a .class file uses preview features: such a class can be loaded only if
- the preview features are enabled in the JVM and
- the .class file does not depend on preview features of any other release
If my code didn’t use any preview features of any release and was compiled with Java 12 JDK (i.e. my .class version is 56.0), I can run it on Java 12 JRE, Java 13 JRE, Java 14 JRE… etc.
However, if I compiled source code that used preview features of Java 12 with Java 12 JDK (i.e. my .class version is 56.65535), I can run it only on Java 12 JRE and not on Java 13 JRE.
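The loading rule can be written down as a small predicate. This is a simplified sketch, not how any real JVM implements the check: the names are hypothetical, a JVM is represented only by the highest major_version it supports, and the lower bound (45) and invalid minor versions are ignored.

```java
// Hypothetical model of the rule; real JVMs perform more checks than this.
final class PreviewLoadRule {

    static boolean accepts(int jvmMajor, boolean previewEnabled, int classMajor, int classMinor) {
        if (classMajor > jvmMajor) {
            return false; // class file is newer than this JVM
        }
        boolean dependsOnPreview = classMajor >= 56 && classMinor == 65535;
        if (!dependsOnPreview) {
            return true; // ordinary class files run on the same or any newer release
        }
        // Preview class files need --enable-preview and exactly the same release.
        return previewEnabled && classMajor == jvmMajor;
    }

    public static void main(String[] args) {
        System.out.println("56.0 on Java 13: " + accepts(57, false, 56, 0));
        System.out.println("56.65535 on Java 12 with preview: " + accepts(56, true, 56, 65535));
        System.out.println("56.65535 on Java 13 with preview: " + accepts(57, true, 56, 65535));
    }
}
```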
Example:
I’m currently using Java 17, which comes with pattern matching for switch still in the preview phase.
Here’s the beginning of a hex dump of the compiled CheckClass.class (no preview features):
    CA FE BA BE 00 00 00 3D ...
And here I use pattern matching for switch: I change CheckClass.java and add some stupid code (a switch case with a condition) to getClassFileInputStream, only to make the compiler treat this file as one that uses preview features.
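If I compile and run it with the proper flag (--enable-preview, which javac accepts only together with --release 17 or -source 17), I get a .class file that begins like this:
    CA FE BA BE FF FF 00 3D ...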
Well, yes, the difference is obvious.
Have a look at bytes 5, 6 (minor version) and 7, 8 (major version) in both .class file hex snippets.
The use of a preview feature (a switch with a condition, used only to illustrate the change of the compiled class file version, without any merit) changed the version from 00 00 00 3D (61.0) to FF FF 00 3D (61.65535).
This change prevents the .class file from being loaded by newer (spec-compliant) JREs.
Constant Pool
The version numbers are followed by constant_pool_count (which is one higher than the number of entries in the constant pool table) and the constant pool table itself.
In the above example constant_pool_count is 00 AA (170), which means there are 169 constant pool entries.
Running javap -v CheckClass.class lists the constant pool entries and confirms that.
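Everything up to constant_pool_count sits at fixed offsets, so the first ten bytes can be decoded without looking at the rest of the file. Below is a sketch under that assumption; ClassFileHeader is a hypothetical name, and the sample bytes are assembled from the values quoted in this post (FF FF 00 3D and 00 AA), not copied from a real dump.

```java
import java.nio.ByteBuffer;

// Hypothetical holder for the first ten bytes of a class file.
final class ClassFileHeader {
    final long magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(long magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    static ClassFileHeader parse(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes); // ByteBuffer is big-endian by default
        long magic = buf.getInt() & 0xFFFFFFFFL; // u4, kept unsigned
        int minor = buf.getShort() & 0xFFFF;     // u2
        int major = buf.getShort() & 0xFFFF;     // u2
        int count = buf.getShort() & 0xFFFF;     // u2
        return new ClassFileHeader(magic, minor, major, count);
    }

    // constant_pool_count is one higher than the number of entries.
    int constantPoolEntries() {
        return constantPoolCount - 1;
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                         (byte) 0xFF, (byte) 0xFF, 0x00, 0x3D, 0x00, (byte) 0xAA};
        ClassFileHeader h = parse(sample);
        System.out.printf("magic=0x%08X version=%d.%d entries=%d%n",
                h.magic, h.majorVersion, h.minorVersion, h.constantPoolEntries());
    }
}
```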
Access flags
The next item is access_flags: a bit mask of flags with the following values and meanings:
Flag | Value | Description |
---|---|---|
ACC_PUBLIC | 0x0001 | Declared public; may be accessed from outside its package. |
ACC_FINAL | 0x0010 | Declared final; no subclasses allowed. |
ACC_SUPER | 0x0020 | Treat superclass methods specially when invoked by the invokespecial instruction. |
ACC_INTERFACE | 0x0200 | Is an interface, not a class. |
ACC_ABSTRACT | 0x0400 | Declared abstract; must not be instantiated. |
ACC_SYNTHETIC | 0x1000 | Declared synthetic; not present in the source code. |
ACC_ANNOTATION | 0x2000 | Declared as an annotation interface. |
ACC_ENUM | 0x4000 | Declared as an enum class. |
ACC_MODULE | 0x8000 | Is a module, not a class or interface. |
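The spec defines the conditions under which each flag must, may or cannot be set (the combination of flags must be valid with regard to Java language semantics). For example, if ACC_INTERFACE is set, ACC_ABSTRACT must be set as well and ACC_FINAL must not be.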
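Decoding the bit mask is a matter of testing each flag value from the table in turn. The sketch below does just that; ClassAccessFlags and decode are hypothetical names, and no validity rules are checked.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical decoder for the class-level access_flags item.
final class ClassAccessFlags {
    // Flag values as listed in the table; LinkedHashMap keeps the table order.
    private static final Map<Integer, String> FLAGS = new LinkedHashMap<>();

    static {
        FLAGS.put(0x0001, "ACC_PUBLIC");
        FLAGS.put(0x0010, "ACC_FINAL");
        FLAGS.put(0x0020, "ACC_SUPER");
        FLAGS.put(0x0200, "ACC_INTERFACE");
        FLAGS.put(0x0400, "ACC_ABSTRACT");
        FLAGS.put(0x1000, "ACC_SYNTHETIC");
        FLAGS.put(0x2000, "ACC_ANNOTATION");
        FLAGS.put(0x4000, "ACC_ENUM");
        FLAGS.put(0x8000, "ACC_MODULE");
    }

    // Returns the names of all flags whose bit is set in the mask.
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Integer, String> flag : FLAGS.entrySet()) {
            if ((accessFlags & flag.getKey()) != 0) {
                names.add(flag.getValue());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // prints [ACC_PUBLIC, ACC_SUPER]
    }
}
```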
This Class
An index into the constant pool table; the entry at that index must be a CONSTANT_Class_info structure (a one-byte tag value denoting what sits in this constant pool slot and a two-byte name_index value pointing to a CONSTANT_Utf8_info structure inside the constant pool).
Super Class
It has the value 0 (zero) or is an index into the constant pool table pointing to a CONSTANT_Class_info structure (denoting the direct super class).
- zero only for the Object class (and for a module); for an interface the item always points to the entry representing Object
- if non-zero, neither the direct super class nor any of its super classes may have the ACC_FINAL access flag set
interfaces_count and interfaces[]
The number of direct superinterfaces and a table of indices into the constant pool table, each pointing to a CONSTANT_Class_info structure; the order of interfaces is the same as in the source file (left to right).
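Resolving this_class, super_class and interfaces[] to names requires walking the whole constant pool first, because its entries have different sizes. The sketch below does that with the entry sizes taken from the spec's constant pool chapter; the class ClassNames is hypothetical, modified UTF-8 is decoded as plain UTF-8 (fine for ASCII names), and malformed input is not handled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader: resolves this_class, super_class and interfaces[] to names.
final class ClassNames {
    final String thisClass;
    final String superClass; // null when super_class is 0
    final List<String> interfaces;

    private ClassNames(String thisClass, String superClass, List<String> interfaces) {
        this.thisClass = thisClass;
        this.superClass = superClass;
        this.interfaces = interfaces;
    }

    private static int u2(ByteBuffer buf) {
        return buf.getShort() & 0xFFFF;
    }

    private static void skip(ByteBuffer buf, int bytes) {
        buf.position(buf.position() + bytes);
    }

    static ClassNames parse(byte[] classFile) {
        ByteBuffer buf = ByteBuffer.wrap(classFile); // big-endian by default
        buf.position(8); // skip magic, minor_version, major_version
        int count = u2(buf);
        String[] utf8 = new String[count]; // CONSTANT_Utf8_info contents by index
        int[] nameIndex = new int[count];  // name_index of CONSTANT_Class_info by index
        for (int i = 1; i < count; i++) {
            int tag = buf.get() & 0xFF;
            switch (tag) {
                case 1: { // Utf8: u2 length followed by that many bytes
                    byte[] raw = new byte[u2(buf)];
                    buf.get(raw);
                    utf8[i] = new String(raw, StandardCharsets.UTF_8); // simplification
                    break;
                }
                case 7: // Class: u2 name_index
                    nameIndex[i] = u2(buf);
                    break;
                case 8: case 16: case 19: case 20: // String, MethodType, Module, Package
                    skip(buf, 2);
                    break;
                case 15: // MethodHandle
                    skip(buf, 3);
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    skip(buf, 4);
                    break;
                case 5: case 6: // Long and Double occupy two constant pool slots
                    skip(buf, 8);
                    i++;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown constant pool tag " + tag);
            }
        }
        skip(buf, 2); // access_flags
        String thisClass = utf8[nameIndex[u2(buf)]];
        int superIndex = u2(buf);
        String superClass = superIndex == 0 ? null : utf8[nameIndex[superIndex]];
        int interfacesCount = u2(buf);
        List<String> interfaces = new ArrayList<>();
        for (int i = 0; i < interfacesCount; i++) {
            interfaces.add(utf8[nameIndex[u2(buf)]]);
        }
        return new ClassNames(thisClass, superClass, interfaces);
    }
}
```

Calling ClassNames.parse with the bytes of a compiled class (for instance read with java.nio.file.Files.readAllBytes) gives the internal names, such as java/lang/Object for the super class.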
Fields, methods, attributes
Those three sections have the following meaning:
- fields_count and fields[]: the number of fields (static and instance) and a table of field_info structures.
- methods_count and methods[]: the number of method_info structures and a table of those structures. The table contains instance methods, class methods and constructors.
- attributes_count and attributes[]: the number of attributes and a table of attribute_info structures.
Unlike this_class or interfaces[], these tables hold the structures themselves, not indices into the constant pool table. They are described in the JVM: Fields, Methods, Attributes post.
A note about compiled module
A module is compiled from module-info.java (module-info is not a valid Java class name) and its compiled representation (.class file) should have:
- only ACC_MODULE flag set
- major_version, minor_version: ≥ 53.0 (Java 9)
- this_class: module-info
- super_class, interfaces_count, fields_count, methods_count: zero
- one Module attribute is mandatory
- some of these attributes are possible: ModulePackages, ModuleMainClass, InnerClasses, SourceFile, SourceDebugExtension, RuntimeVisibleAnnotations, and RuntimeInvisibleAnnotations
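The structural part of this list is easy to express as a check. The sketch below is hypothetical (ModuleInfoCheck is not an existing API) and takes already decoded values as parameters; it does not look at the attributes, so the mandatory Module attribute is not verified.

```java
// Hypothetical check of the structural constraints listed above (attributes are ignored).
final class ModuleInfoCheck {
    static final int ACC_MODULE = 0x8000;

    static boolean looksLikeModule(int majorVersion, int accessFlags, String thisClassName,
                                   int superClass, int interfacesCount,
                                   int fieldsCount, int methodsCount) {
        return majorVersion >= 53                      // 53.0 is Java 9
                && accessFlags == ACC_MODULE           // ACC_MODULE and nothing else
                && "module-info".equals(thisClassName)
                && superClass == 0
                && interfacesCount == 0
                && fieldsCount == 0
                && methodsCount == 0;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeModule(61, 0x8000, "module-info", 0, 0, 0, 0));
    }
}
```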
That’s all for today. I haven’t described the specifics of the constant pool structures representing fields, methods and attributes yet.
This is part of chapter 4.4 and I’ll write about it later.
In the next post I’ll briefly describe the rules for class and interface names (chapter 4.2). Stay tuned!
This post is part of the jvm series.
- 2022-14-02 - JVM: Loading, Linking, Initializing
- 2022-11-02 - JVM: Verification and Checks
- 2022-06-02 - JVM: Fields, Methods, Attributes
- 2022-02-02 - JVM: The Constant Pool
- 2022-26-01 - JVM: Names and Descriptors
- 2022-22-01 - JVM: Class File Format - Structure
- 2022-16-01 - JVM: Compiling for the JVM
- 2022-08-01 - JVM: Instruction Set Summary
- 2022-07-01 - JVM: Structure of the JVM
- 2022-04-01 - JVM: Introduction