JVM: Class File Format - Structure

Class File

Each .class file contains the definition of a single:

  • class,
  • interface, or
  • module

The class file format (the specific order and meaning of bytes) described in the spec does not necessarily apply to an actual file on the filesystem: those “bytes” might have been read from a database, downloaded from the web, or constructed by a class loader. Keep this in mind whenever you encounter the phrase “class file format”.

Key properties of a class file:

  • it is a stream of 8-bit bytes
  • 16-bit and 32-bit values are constructed by reading two or four consecutive bytes, stored in big-endian order (high byte first)
  • the spec describes the format using pseudo-structures (C-like notation)
  • the contents of those structures are called items
  • items are stored sequentially, without padding or alignment
  • some structures contain tables of variable-sized items (so a table index cannot be translated directly into a byte offset)
  • arrays are structures with fixed-size, contiguous items and can be indexed like an array
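To make the byte-order point concrete, here is a minimal sketch of how two or four consecutive bytes turn into a u2 or u4 value. The class and method names (BigEndian, u2, u4) are my own, they are not part of the spec or the JDK.

```java
// Illustrative helper; the names are assumptions made for this example.
final class BigEndian {

    // Reads an unsigned 16-bit value (u2): high byte first, then low byte.
    static int u2(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 8) | (bytes[offset + 1] & 0xFF);
    }

    // Reads an unsigned 32-bit value (u4). The result is a long so that
    // values above Integer.MAX_VALUE (such as 0xCAFEBABE) stay positive.
    static long u4(byte[] bytes, int offset) {
        return ((long) u2(bytes, offset) << 16) | u2(bytes, offset + 2);
    }
}
```

The `& 0xFF` mask is needed because Java bytes are signed; without it a byte such as 0xCA would be sign-extended and corrupt the result.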

The Top Level Structure

Here’s the structure in pseudo-code:

ClassFile {
  u4 magic;
  u2 minor_version;
  u2 major_version;
  u2 constant_pool_count;
  cp_info constant_pool[constant_pool_count-1];
  u2 access_flags;
  u2 this_class;
  u2 super_class;
  u2 interfaces_count;
  u2 interfaces[interfaces_count];
  u2 fields_count;
  field_info fields[fields_count];
  u2 methods_count;
  method_info methods[methods_count];
  u2 attributes_count;
  attribute_info attributes[attributes_count];
}
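As a rough illustration of how the first four items of this structure map onto code, the sketch below reads them with DataInputStream, which already reads multi-byte values in big-endian order. ClassHeader and its field names are hypothetical names chosen for this example; the sample bytes are the first ten bytes of a class file with version 61.0.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical holder for the first four items of the ClassFile structure.
final class ClassHeader {
    final int magic;             // u4, kept as a (possibly negative) int
    final int minorVersion;      // u2
    final int majorVersion;      // u2
    final int constantPoolCount; // u2

    ClassHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Reads the items in the order given by the ClassFile pseudo-structure.
    static ClassHeader read(DataInputStream in) throws IOException {
        int magic = in.readInt();
        int minor = in.readUnsignedShort();
        int major = in.readUnsignedShort();
        int cpCount = in.readUnsignedShort();
        return new ClassHeader(magic, minor, major, cpCount);
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                         0x00, 0x00, 0x00, 0x3D, 0x00, 0x4B};
        ClassHeader h = read(new DataInputStream(new ByteArrayInputStream(sample)));
        // prints: CAFEBABE 61.0, constant_pool_count=75
        System.out.println(String.format("%08X %d.%d, constant_pool_count=%d",
                h.magic, h.majorVersion, h.minorVersion, h.constantPoolCount));
    }
}
```

Everything after constant_pool_count is variable-sized, so reading further requires parsing the constant pool entries one by one.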

Magic

The first four bytes identify the class file format; their value is always 0xCAFEBABE.

Let’s write CheckClass.java, which prints the first four bytes of the .class file given as a command-line parameter (beware: no error checking):

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

class CheckClass {
    
    public static byte[] magic(String fname) throws IOException{
        try(DataInputStream in = new DataInputStream(new FileInputStream(fname))) {
            var ba = new byte[4];
            in.readFully(ba); // unlike read(), fails instead of returning fewer than 4 bytes
            return ba;
        }
    }
    
    public static void main(String[] args) throws IOException {
        var bs = magic(args[0]);
        for (var b : bs) {
            System.out.print(String.format("%02x", b)); // always two hex digits per byte
        }
    }
}

Versions

The minor_version and major_version items denote the version of the class file format, written as M.m, where M is major_version and m is minor_version.

  • major_version: the Java release this class was built for; a JRE of that release accepts (i.e. runs) all class files whose major_version is less than or equal to this value
  • minor_version: valid values are:
    • for major_version 45..55 (inclusive): any value
    • for major_version 56 and above (i.e. Java 12 or higher):
      • 0 if no preview features are used, or
      • 65535 if preview features are used; here “are used” means that the class file depends on specific preview features.
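The rules above are easy to express in code. This is only a sketch: ClassVersion and its methods are names I made up, and the "major minus 44" mapping is a convenience that matches the pairs used in this post (53 is Java 9, 56 is Java 12, 61 is Java 17); I assume it only for major_version 49 and above.

```java
// Illustrative helper; not part of the JDK or the spec.
final class ClassVersion {
    static final int PREVIEW_MINOR = 65535;

    // True if the version pair marks a class file that depends on preview features.
    static boolean usesPreview(int major, int minor) {
        return major >= 56 && minor == PREVIEW_MINOR;
    }

    // Applies the minor_version rules: any value for 45..55, only 0 or 65535 from 56 on.
    static boolean isValidMinor(int major, int minor) {
        if (major >= 56) {
            return minor == 0 || minor == PREVIEW_MINOR;
        }
        return major >= 45;
    }

    // Java release number for a major_version (assumed valid for major >= 49 only).
    static int javaRelease(int major) {
        return major - 44;
    }
}
```

For example, `ClassVersion.javaRelease(61)` gives 17 and `ClassVersion.usesPreview(61, 65535)` gives true.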

Preview features, available since Java 12, let testers and early adopters see how specific language features behave in real life and encourage them to give feedback. They can be used only if explicitly enabled in the JVM (--enable-preview).

Preview features handling

An interesting aspect of the JVM spec is that it defines how a JVM implementation must behave when a .class file uses preview features: such a class can be loaded only if

  • preview features are enabled in the JVM, and
  • the .class file does not depend on any preview features from other releases

If my code didn’t use any preview features of any release and was compiled with Java 12 JDK (i.e. my .class version is 56.0), I can run it on Java 12 JRE, Java 13 JRE, Java 14 JRE… etc.

However, if I compiled source code that used preview features of Java 12 with the Java 12 JDK (i.e. my .class version is 56.65535), I can run it only on a Java 12 JRE and not on a Java 13 JRE.
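The two scenarios above boil down to a small decision function. The sketch below is a simplification under my own assumptions: PreviewRule and canLoad are hypothetical names, and it ignores everything else a real JVM checks (for example invalid minor versions or major versions below 45).

```java
// Simplified model of the version check described above; not a real JVM API.
final class PreviewRule {

    static boolean canLoad(int classMajor, int classMinor, int jvmMajor, boolean previewEnabled) {
        if (classMajor > jvmMajor) {
            return false; // class file is newer than this JVM
        }
        boolean dependsOnPreview = classMajor >= 56 && classMinor == 65535;
        if (dependsOnPreview) {
            // Preview features are tied to exactly one release and must be switched on.
            return previewEnabled && classMajor == jvmMajor;
        }
        return true; // ordinary class file from the same or an older release
    }
}
```

With this model, `canLoad(56, 0, 57, false)` is true (a plain Java 12 class on Java 13), while `canLoad(56, 65535, 57, true)` is false (Java 12 preview features on Java 13).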

Example:

I’m currently using Java 17, which comes with pattern matching for switch still in the preview phase.

Here’s the beginning of a hex dump of CheckClass.class (compiled from the CheckClass.java above, which uses no preview features):

00000000   CA FE BA BE  00 00 00 3D  00 4B 0A 00  02 00 03 07  .......=.K......
00000010   00 04 0C 00  05 00 06 01  00 10 6A 61  76 61 2F 6C  ..........java/l
00000020   61 6E 67 2F  4F 62 6A 65  63 74 01 00  06 3C 69 6E  ang/Object...<in

And here I use pattern matching for switch. I change CheckClass.java and add some silly code (a guarded pattern in a switch case) in getClassFileInputstream, only to make the compiler treat this file as one that uses preview features:

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

class CheckClass {
    
    public static byte[] magic(InputStream is) throws IOException{
        try(DataInputStream in = new DataInputStream(is)) {
            var ba = new byte[4];
            in.readFully(ba); // unlike read(), fails instead of returning fewer than 4 bytes
            return ba;
        }
    }
    
    private static Optional<InputStream> getClassFileInputstream(String[] args) throws IOException {
        Optional<Path> pa = switch(args.length) {
            case 1 -> Optional.ofNullable(Paths.get(args[0]));
            default -> Optional.empty();
        };
        return switch(pa) {
            case Optional<Path> p && (p.isPresent() && p.get().toFile().exists())  -> Optional.of(new FileInputStream(p.get().toFile()));
            default -> Optional.empty();
        };
    }
        
    
    public static void main(String[] args) throws IOException {
        try(var ins = getClassFileInputstream(args)
                            .orElseThrow(()-> 
                                new IOException("Such file does not exist"))) { 
            var bs = magic(ins);
            for (var b : bs) {
                System.out.print(String.format("%02x", b)); // always two hex digits per byte
            }
        }
    }       
}

If I compile and execute it with proper flag:

[karma@tpd|~/d/j/jvm] javac --enable-preview --source 17 CheckClass.java && java --enable-preview CheckClass CheckClass.class
Note: CheckClass.java uses preview features of Java SE 17.
Note: Recompile with -Xlint:preview for details.
cafebabe
[karma@tpd|~/d/j/jvm]

I get .class file that looks like this:

00000000   CA FE BA BE  FF FF 00 3D  00 AA 0A 00  02 00 03 07  .......=........
00000010   00 04 0C 00  05 00 06 01  00 10 6A 61  76 61 2F 6C  ..........java/l
00000020   61 6E 67 2F  4F 62 6A 65  63 74 01 00  06 3C 69 6E  ang/Object...<in

Well, yes, the difference is obvious.

Have a look at bytes 5, 6 (minor version) and 7, 8 (major version) in both .class file hex snippets.

The use of a preview feature (a switch with a guarded pattern, used here only to illustrate the version change, not for its merit) changed the version from 00 00 00 3D (61.0) to FF FF 00 3D (61.65535).

This change prevents this .class file from being loaded by newer (spec-compliant) JREs.

Constant Pool

The version numbers are followed by the constant pool count (which is one higher than the number of entries in the constant pool table) and the constant pool table itself. In the above example the constant pool count is 00 AA (170), which means there are 169 constant pool entries.

javap -v CheckClass.class confirms that:

[karma@tpd|~/d/j/jvm] javap -v CheckClass.class
Classfile /home/karma/dev/java/jvm/CheckClass.class
  Last modified 24 sty 2022; size 3265 bytes
  SHA-256 checksum 09de46d96e700555b111c03172ccfa83297c4f33782154cd28039a42cf453565
  Compiled from "CheckClass.java"
class CheckClass
  minor version: 65535
  major version: 61
  flags: (0x0020) ACC_SUPER
  this_class: #80                         // CheckClass
  super_class: #2                         // java/lang/Object
  interfaces: 0, fields: 0, methods: 5, attributes: 3
Constant pool:
    #1 = Methodref          #2.#3         // java/lang/Object."<init>":()V
    #2 = Class              #4            // java/lang/Object
    #3 = NameAndType        #5:#6         // "<init>":()V
    #4 = Utf8               java/lang/Object
    #5 = Utf8               <init>
    #6 = Utf8               ()V
    #7 = Class              #8            // java/io/DataInputStream
    #8 = Utf8               java/io/DataInputStream
    #9 = Methodref          #7.#10        // java/io/DataInputStream."<init>":(Ljava/io/InputStream;)V
   #10 = NameAndType        #5:#11        // "<init>":(Ljava/io/InputStream;)V
   #11 = Utf8               (Ljava/io/InputStream;)V
   //--------------------------- (...) SNIP ---------------------------------------------- //
  #159 = MethodType         #59           //  ()Ljava/lang/Object;
  #160 = MethodHandle       6:#161        // REF_invokeStatic CheckClass.lambda$main$0:()Ljava/io/IOException;
  #161 = Methodref          #80.#162      // CheckClass.lambda$main$0:()Ljava/io/IOException;
  #162 = NameAndType        #141:#142     // lambda$main$0:()Ljava/io/IOException;
  #163 = MethodType         #142          //  ()Ljava/io/IOException;
  #164 = Utf8               InnerClasses
  #165 = Class              #166          // java/lang/invoke/MethodHandles$Lookup
  #166 = Utf8               java/lang/invoke/MethodHandles$Lookup
  #167 = Class              #168          // java/lang/invoke/MethodHandles
  #168 = Utf8               java/lang/invoke/MethodHandles
  #169 = Utf8               Lookup
{
  CheckClass();
    descriptor: ()V
    flags: (0x0000)
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 11: 0

  public static byte[] magic(java.io.InputStream) throws java.io.IOException;
    descriptor: (Ljava/io/InputStream;)[B
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:

Access flags

The next item is access_flags: a bit mask of flags with the following values and meanings:

Flag            Value   Description
ACC_PUBLIC      0x0001  Declared public; may be accessed from outside its package.
ACC_FINAL       0x0010  Declared final; no subclasses allowed.
ACC_SUPER       0x0020  Treat superclass methods specially when invoked by the invokespecial instruction.
ACC_INTERFACE   0x0200  Is an interface, not a class.
ACC_ABSTRACT    0x0400  Declared abstract; must not be instantiated.
ACC_SYNTHETIC   0x1000  Declared synthetic; not present in the source code.
ACC_ANNOTATION  0x2000  Declared as an annotation interface.
ACC_ENUM        0x4000  Declared as an enum class.
ACC_MODULE      0x8000  Is a module, not a class or interface.

The spec defines the conditions under which each flag must, may, or must not be set (flags must be valid with regard to Java language semantics).
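Since access_flags is a plain bit mask, decoding it is a matter of testing each bit. Here is a minimal sketch using the values from the table above; ClassAccessFlags and decode are names invented for this example, and the code does not validate flag combinations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative decoder for the class-level access_flags bit mask.
final class ClassAccessFlags {
    // Insertion-ordered so that names come out in ascending bit order.
    private static final Map<Integer, String> FLAGS = new LinkedHashMap<>();
    static {
        FLAGS.put(0x0001, "ACC_PUBLIC");
        FLAGS.put(0x0010, "ACC_FINAL");
        FLAGS.put(0x0020, "ACC_SUPER");
        FLAGS.put(0x0200, "ACC_INTERFACE");
        FLAGS.put(0x0400, "ACC_ABSTRACT");
        FLAGS.put(0x1000, "ACC_SYNTHETIC");
        FLAGS.put(0x2000, "ACC_ANNOTATION");
        FLAGS.put(0x4000, "ACC_ENUM");
        FLAGS.put(0x8000, "ACC_MODULE");
    }

    // Returns the names of all flags whose bit is set in accessFlags.
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Integer, String> e : FLAGS.entrySet()) {
            if ((accessFlags & e.getKey()) != 0) {
                names.add(e.getValue());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // 0x0020 is the value javap reports for CheckClass; prints [ACC_SUPER]
        System.out.println(decode(0x0020));
    }
}
```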

This Class

An index into the constant pool table; the entry at that index is a CONSTANT_Class_info structure (a one-byte tag denoting what sits in this constant pool slot, and a two-byte name_index pointing to a CONSTANT_Utf8_info structure inside the constant pool) that represents the class or interface defined by this file.

Super Class

It is either 0 (zero) or an index into the constant pool table whose entry is a CONSTANT_Class_info structure (denoting the direct superclass).

  • zero only if this is the Object class (or a module, see below); for an interface, super_class always points to the Object class
  • non-zero means that neither the direct superclass nor any of its superclasses may have ACC_FINAL set in its access flags

interfaces_count and interfaces[]

The number of direct superinterfaces and a table of indices into the constant pool table, each pointing to a CONSTANT_Class_info entry; the order of interfaces is the same as in the source file.

Fields, methods, attributes

These three sections have the following meaning:

  • fields_count and fields[] - the number of fields (static and instance) declared by this class or interface, and a table of field_info structures.
  • methods_count and methods[] - the number of method_info structures and a table of those structures. The table contains instance methods, class methods and constructors.
  • attributes_count and attributes[] - the number of attributes and a table of attribute_info structures. They are described in the JVM: Fields, Methods, Attributes post.

Unlike interfaces[], these tables hold the structures themselves, not indices into the constant pool.

A note about compiled module

A module is compiled from module-info.java (module-info is not a valid Java class name because of the hyphen), and its compiled representation (.class file) should have:

  • only ACC_MODULE flag set
  • major_version, minor_version: ≥ 53.0 (Java 9)
  • this_class: module-info
  • super_class, interfaces_count, fields_count, methods_count: zero
  • one Module attribute is mandatory
  • some of these attributes are possible: ModulePackages, ModuleMainClass, InnerClasses, SourceFile, SourceDebugExtension, RuntimeVisibleAnnotations, and RuntimeInvisibleAnnotations
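The numeric constraints from this list can be checked with a few comparisons. This is just a sketch with made-up names (ModuleCheck, looksLikeModule); it deliberately skips the this_class name and the attribute rules, because those require parsing the constant pool and the attributes table.

```java
// Partial, illustrative check of the module-related constraints listed above.
final class ModuleCheck {
    static final int ACC_MODULE = 0x8000;

    static boolean looksLikeModule(int accessFlags, int majorVersion, int superClass,
                                   int interfacesCount, int fieldsCount, int methodsCount) {
        return accessFlags == ACC_MODULE   // only ACC_MODULE may be set
                && majorVersion >= 53      // Java 9 or newer
                && superClass == 0
                && interfacesCount == 0
                && fieldsCount == 0
                && methodsCount == 0;
    }
}
```

For the CheckClass example (flags 0x0020, five methods) this returns false, as expected for an ordinary class.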

That’s all for today. I haven’t yet described the specifics of the constant pool structures representing fields, methods and attributes.

This is part of chapter 4.4 and I’ll write about it later.

In the next post I’ll briefly describe the rules for class and interface names (chapter 4.2). Stay tuned!


This post is part of the jvm series.