JVM: Verification and Checks

The weather is still “very February”, so I grab a cup of coffee ☕, start reading and take notes.

This post completes Chapter 4 of the JVM Specification - it quickly goes through format checking, shows two types of Code attribute checks and lists JVM limitations.

Format Checking

When a .class file is loaded, the JVM first needs to ensure that it has the basic format of a class file. Five basic integrity checks are performed on a .class file:

  • the first four bytes must be CAFEBABE
  • all predefined attributes must be of proper length (this is not checked for StackMapTable, AnnotationDefault and Runtime(...)Annotations family of attributes)
  • class file must not be truncated or have extra bytes at the end
  • the constant pool must satisfy basic constraints (for example, indices into constant_pool must point to constants of the proper type)
  • all field and method references in the constant pool must have valid names, classes and descriptors, where “valid” means “well-formed” and not “actually existing in the class”; the more detailed checks are performed later, during bytecode verification and resolution.

This phase of checking is really basic and is distinct from bytecode verification.
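
To make the first check concrete, here is a minimal sketch in Java that validates the magic number and reads the version numbers that follow it. The class name ClassFileHeader and its parse method are my own hypothetical helpers, not part of any JDK API, and the sketch covers only the first 8 bytes of the file, not the other four checks.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only: it checks the magic number
// and reads minor_version and major_version, nothing more.
final class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE;

    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // Layout of the first 8 bytes: u4 magic, u2 minor_version, u2 major_version.
    public static ClassFileHeader parse(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new ClassFormatError("truncated class file");
        }
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        if (in.getInt() != MAGIC) {
            throw new ClassFormatError("bad magic number");
        }
        int minor = in.getShort() & 0xFFFF; // treat u2 as unsigned
        int major = in.getShort() & 0xFFFF;
        return new ClassFileHeader(minor, major);
    }
}
```

Passing the bytes of any compiled class to ClassFileHeader.parse should return its version numbers, while a file that starts with anything other than CAFEBABE is rejected with a ClassFormatError.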

(Right now my 3 yo daughter is drumming on my kitchen table with her wooden toys; my mental model of JVM loading and linking process has just been smashed to pieces… I need a while to pick them up!)

Constraints on JVM Code

The code of a method, an instance initialization method, or a class or interface initialization method is stored in the code array of the Code attribute of a method_info structure of a .class file (you can check my post on fields, methods and attributes).

[Image: iconic robot girl with a magnifying glass]

There are two types of constraints:

  • static constraints define the well-formedness of the file; they specify how the instructions must be laid out in the code array and what the operands of individual instructions must be; some examples:
    • the indexbyte operands of each invokevirtual instruction must point to a constant_pool entry of kind CONSTANT_Methodref
    • only the invokespecial instruction is allowed to invoke an instance initialization method
    • the target of each jump and branch instruction must be the opcode of an instruction within this method
  • structural constraints specify relations between JVM instructions; some examples:
    • at no point in execution can the operand stack grow to more than max_stack items
    • likewise, no more values can be popped from the operand stack than it contains
    • the type of every value stored into an array by aastore must be a reference type

For a full list of constraints see the constraints section in the JVM Specification.
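
The two structural constraints about the operand stack can be sketched in a few lines of Java. This is a heavily simplified model of my own, not how a real verifier works: it assumes straight-line code (no branches or exception handlers), ignores types, and reduces each instruction to the number of stack slots it pops and pushes. StackDepthCheck and respectsMaxStack are hypothetical names.

```java
// Hypothetical, heavily simplified model: straight-line code only,
// each instruction reduced to {pops, pushes} in operand stack slots.
final class StackDepthCheck {

    // Returns true when the stack never underflows and never exceeds maxStack.
    public static boolean respectsMaxStack(int[][] effects, int maxStack) {
        int depth = 0;
        for (int[] effect : effects) {
            depth -= effect[0];          // values consumed by the instruction
            if (depth < 0) {
                return false;            // popped more than the stack contains
            }
            depth += effect[1];          // values produced by the instruction
            if (depth > maxStack) {
                return false;            // grew beyond max_stack
            }
        }
        return true;
    }
}
```

For example, the sequence iconst_1, iconst_2, iadd, ireturn can be modelled as {0,1}, {0,1}, {2,1}, {1,0}: it passes with max_stack set to 2 and fails with max_stack set to 1.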

Verification of class Files

Although a Java compiler will normally produce code that meets all these constraints, it is possible that the JVM will be asked to load maliciously (or accidentally) modified bytecode. Therefore, verification takes place at link time. It is a set of expensive checks, but it is done once (and doesn’t need to be repeated for each interpreted instruction at run time).

At run time the JVM can therefore assume that:

  • there are no operand stack overflows/underflows
  • all local variable uses and stores are valid
  • all arguments to all instructions are of valid types

There are two strategies used for verification:

  • by type checking (for .class files with version 50.0 or higher)
  • by type inference, for versions below 50.0 (this strategy does not need to be supported by Java ME CLDC and Java Card implementations)
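
As a rough sketch, the choice between the two strategies boils down to a comparison on the major version of the class file. VerifierChoice and forMajorVersion are hypothetical names of mine, and the sketch ignores any finer rules an actual JVM implementation applies.

```java
// Hypothetical illustration of the version-based choice described above.
final class VerifierChoice {

    enum Strategy { TYPE_CHECKING, TYPE_INFERENCE }

    // major is the major_version item of the ClassFile structure.
    public static Strategy forMajorVersion(int major) {
        return major >= 50 ? Strategy.TYPE_CHECKING : Strategy.TYPE_INFERENCE;
    }
}
```

Major version 50 corresponds to Java SE 6, so anything compiled for a reasonably modern target ends up in the type checking branch.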

Both strategies enforce the static and structural constraints on the Code attribute. In addition, verification will:

  • ensure that final classes are not subclassed
  • ensure that final methods are not overridden
  • check that every class (except Object) has a direct superclass

For details of each strategy of verification see the links above.

Danger of falling asleep
A strong coffee ☕ will be necessary; otherwise you’ll fall asleep somewhere around the verification types in the type checking strategy.

Verification is a very important part of the loading, linking and initializing process. This process is described in detail in Chapter 5 of the JVM Specification.

Limitations of the JVM

Some items in the class file format that denote a count of something are stored as 16-bit values, which limits each such count to at most 65535 (the largest unsigned value that fits into 16 bits):

  • the constant_pool_count item of the ClassFile structure is a 16-bit value, which means that constant_pool is limited to 65535 entries; the spec nicely comments:

    This acts as an internal limit on the total complexity of a single class or interface.

Similarly:

  • number of fields declared by a class/interface (i.e. not inherited) is limited to 65535 due to the size of the fields_count item
  • number of methods declared by a class/interface is also limited to 65535 due to the size of the methods_count item
  • number of direct superinterfaces (interfaces_count): 65535
  • number of local variables in the local variable array of a frame (max_locals item in the Code attribute): 65535 (long and double values each take 2 elements of that count, making the effective limit even smaller)
  • size of the operand stack (max_stack in the Code attribute; with the same issue with long and double as above): 65535
  • length of field and method names, field and method descriptors and other constant string values (the length item is 16-bit in the CONSTANT_Utf8_info structure): 65535; the limit is on the number of bytes in the encoding, not on the number of characters
  • number of dimensions in arrays: 255 (due to the size of the one-byte dimensions operand of the multianewarray instruction)
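
The arithmetic behind these numbers is small enough to write down. The sketch below is my own illustration, with hypothetical names (ClassFileLimits, slotsFor, maxLocalsOfType): it derives the two maxima from the item widths and counts local variable slots for a list of field descriptors, with long ("J") and double ("D") taking two slots each.

```java
// Hypothetical helper illustrating the limits listed above.
final class ClassFileLimits {
    static final int U2_MAX = (1 << 16) - 1; // largest unsigned 16-bit value: 65535
    static final int U1_MAX = (1 << 8) - 1;  // largest unsigned 8-bit value: 255

    // Local variable slots needed for the given field descriptors,
    // e.g. "I", "J", "D", "Ljava/lang/String;".
    public static int slotsFor(String... descriptors) {
        int slots = 0;
        for (String d : descriptors) {
            slots += (d.equals("J") || d.equals("D")) ? 2 : 1;
        }
        return slots;
    }

    // How many locals of one type fit under the max_locals limit.
    public static int maxLocalsOfType(String descriptor) {
        return U2_MAX / slotsFor(descriptor);
    }
}
```

With these definitions, a frame holding an int, a long and a double needs 5 slots, and a method could hold at most 32767 long variables before hitting the max_locals limit.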

Summary

The last three sections of Chapter 4 of the JVM Specification describe:

  • constraints that are imposed on the Code attributes of a .class file so that the file can be considered well-formed;
  • two verification algorithms:
    • the old one (for classes with versions below 50.0), based on type inference and described in human language,
    • and the newer one (for class versions 50.0 or above), based on type checking and described using Prolog predicates;
  • limitations on the number of fields and methods, the size of the operand stack or the dimensionality of arrays, which result from using 16-bit (or, for array dimensions, 8-bit) values for counts in the internal structures of the .class file format.

My next blog post will cover loading, linking and initializing of classes in the JVM.

Stay tuned!


This post is part of the jvm series.

All posts in this series: