JVM: Names and Descriptors

This post is a note taken while reading sections 4.2 and 4.3 of chapter 4 of the Java Virtual Machine Specification.

Section 4.2 describes the rules for representing class and interface names, field and method names, and module and package names.

Section 4.3 describes descriptors and provides a short grammar according to which a representation of a type is constructed.

Names

Binary class and interface names

  • these names always appear in a fully qualified form known as a binary name (see JLS 13.1)
  • they are written with forward slashes instead of dots (for historical reasons), e.g. java/lang/Thread
  • they are represented as CONSTANT_Utf8_info structures
  • they are referenced from CONSTANT_NameAndType_info structures (as part of a descriptor) and from CONSTANT_Class_info structures
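
The dot-to-slash substitution described above can be sketched in a few lines of Java. The class and method names below (InternalNames, toInternalForm, toBinaryName) are made up for this note and are not part of any JDK API; the only assumption is that Class.getName() returns the dotted binary name.

```java
// A minimal sketch: converting between the dotted binary name returned by
// Class.getName() and the slash-separated form stored in a class file.
// InternalNames and its methods are hypothetical helper names.
final class InternalNames {

    // "java.lang.Thread" -> "java/lang/Thread"
    static String toInternalForm(String binaryName) {
        return binaryName.replace('.', '/');
    }

    // "java/lang/Thread" -> "java.lang.Thread"
    static String toBinaryName(String internalName) {
        return internalName.replace('/', '.');
    }

    public static void main(String[] args) {
        // prints java/lang/Thread
        System.out.println(toInternalForm(Thread.class.getName()));
        // nested classes keep their '$': prints java/util/Map$Entry
        System.out.println(toInternalForm(java.util.Map.Entry.class.getName()));
    }
}
```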

Field and method names

Unqualified names are used to store the names of:

  • fields
  • methods
  • local variables
  • formal parameters

They:

  • must contain at least one Unicode code point and must not contain any of . ; [ / (dot, semicolon, left square bracket or forward slash)
  • in the case of method names, must additionally not contain < or > (the special names <init> and <clinit> are the only exceptions)
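
As a rough illustration, the two rules above can be written as a pair of checks. This is only a sketch of the rules listed in this note, not a full validator; UnqualifiedNames and its method names are invented for the example.

```java
// A minimal sketch of the unqualified-name rules listed above.
// UnqualifiedNames and its methods are hypothetical helper names.
final class UnqualifiedNames {

    // Fields, local variables and formal parameters:
    // at least one character, and none of . ; [ /
    static boolean isValidUnqualifiedName(String name) {
        if (name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '.' || c == ';' || c == '[' || c == '/') {
                return false;
            }
        }
        return true;
    }

    // Methods: same rules, plus no '<' or '>' unless the name is
    // one of the two special names.
    static boolean isValidMethodName(String name) {
        if (name.equals("<init>") || name.equals("<clinit>")) {
            return true;
        }
        return isValidUnqualifiedName(name)
                && name.indexOf('<') < 0
                && name.indexOf('>') < 0;
    }
}
```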

Module names

Module names:

  • when referenced from the Module attribute, are stored in CONSTANT_Module_info structures in the constant pool; each such structure wraps a CONSTANT_Utf8_info structure holding the name
  • don’t have dots replaced by slashes
  • cannot contain code points in the range ‘\u0000’ to ‘\u001F’
  • treat \ as a reserved escape character: it cannot be used in a module name unless it is followed by another \, an at-sign (@) or a colon (:)
  • treat the colon and the at-sign as reserved characters, so they can be part of the name only if escaped with a backslash
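
The escaping rules are easier to follow as code. The check below is a sketch that covers only the rules listed above (control characters, backslash escapes, reserved colon and at-sign); ModuleNames and isValidModuleName are hypothetical names chosen for this note.

```java
// A minimal sketch of the module-name rules listed above.
// ModuleNames and isValidModuleName are hypothetical helper names.
final class ModuleNames {

    static boolean isValidModuleName(String name) {
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c <= 0x1F) {
                return false; // control characters U+0000..U+001F are forbidden
            }
            if (c == '\\') {
                // a backslash must be followed by a backslash, a colon or an at-sign
                if (i + 1 >= name.length()) {
                    return false;
                }
                char next = name.charAt(i + 1);
                if (next != '\\' && next != ':' && next != '@') {
                    return false;
                }
                i++; // the escaped character has been consumed
            } else if (c == ':' || c == '@') {
                return false; // reserved unless escaped with a backslash
            }
        }
        return true;
    }
}
```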

Package names

Package names referenced from modules are stored in CONSTANT_Package_info structures, each of which wraps a CONSTANT_Utf8_info structure containing the package name in internal form (i.e. with forward slashes).

Descriptors

A descriptor is a string representing the type of a field or method. The grammar describing descriptors is given below (single characters such as B, L, ;, [, ( and ) are terminal symbols), and the meaning of each field type is given in the table that follows.

Field Descriptors

Descriptor = FieldDescriptor | MethodDescriptor
FieldDescriptor = FieldType
FieldType = BaseType | ObjectType | ArrayType
BaseType = B | C | D | F | I | J | S | Z
ObjectType = L ClassName ;
ArrayType = [ ComponentType
ComponentType = FieldType

Interpretation of field types

FieldType term | Type | Interpretation
B | byte | signed byte
C | char | Unicode character code point in the Basic Multilingual Plane, encoded with UTF-16
D | double | double-precision floating-point value
F | float | single-precision floating-point value
I | int | integer
J | long | long integer
L ClassName ; | reference | an instance of class ClassName
S | short | signed short
Z | boolean | true or false
[ | reference | one array dimension

Example (Field Descriptors)

  • Field with type “int” is denoted as I
  • Field with type “array of long” is denoted as [J
  • Field with type “Object” is denoted as Ljava/lang/Object;
  • Field with type “array of chars” is denoted as [C
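
To check the examples above, here is a small sketch that builds a field descriptor from a java.lang.Class by following the grammar directly. FieldDescriptors and descriptorOf are names invented for this note. Newer JDKs (12 and later, as far as I know) also offer Class.descriptorString(), which should give the same result for ordinary classes.

```java
// A minimal sketch: deriving a field descriptor from a Class object
// by following the FieldType grammar above.
// FieldDescriptors and descriptorOf are hypothetical helper names.
final class FieldDescriptors {

    static String descriptorOf(Class<?> type) {
        if (type == void.class) {
            throw new IllegalArgumentException("void is not a field type");
        }
        if (type.isArray()) {
            // ArrayType = [ ComponentType
            return "[" + descriptorOf(type.getComponentType());
        }
        // BaseType = B | C | D | F | I | J | S | Z
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == double.class) return "D";
        if (type == float.class) return "F";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == short.class) return "S";
        if (type == boolean.class) return "Z";
        // ObjectType = L ClassName ;  (class name in internal form)
        return "L" + type.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(descriptorOf(int.class));    // I
        System.out.println(descriptorOf(long[].class)); // [J
        System.out.println(descriptorOf(Object.class)); // Ljava/lang/Object;
        System.out.println(descriptorOf(char[].class)); // [C
    }
}
```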

Method Descriptors

MethodDescriptor = ( ParameterDescriptor* ) ReturnDescriptor
ParameterDescriptor = FieldType
ReturnDescriptor = FieldType | VoidDescriptor
VoidDescriptor = V

Example (Method Descriptors)

  • method void main(String[]): ([Ljava/lang/String;)V
  • method Object m(int i, double d, Thread t): (IDLjava/lang/Thread;)Ljava/lang/Object;
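
The JDK can produce these strings itself: java.lang.invoke.MethodType has a toMethodDescriptorString() method. The sketch below reproduces the two examples above; the wrapper class MethodDescriptors and its describe helper are made up for this note.

```java
import java.lang.invoke.MethodType;

// A minimal sketch: building method descriptors with java.lang.invoke.MethodType.
// MethodDescriptors and describe are hypothetical helper names.
final class MethodDescriptors {

    // Return type first, then the parameter types in declaration order.
    static String describe(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void main(String[]) -> ([Ljava/lang/String;)V
        System.out.println(describe(void.class, String[].class));
        // Object m(int, double, Thread) -> (IDLjava/lang/Thread;)Ljava/lang/Object;
        System.out.println(describe(Object.class, int.class, double.class, Thread.class));
    }
}
```

Note that parameter names (i, d, t) do not appear anywhere in the descriptor; only the types do.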

Validity of method descriptor

The spec says that a method descriptor is valid only if the total length of its parameters does not exceed 255. Each long or double parameter contributes 2 units to that total and a parameter of any other type contributes 1 unit. For instance and interface methods, the implicit “this” is also counted as 1 unit.
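
The counting rule can be sketched as follows. ParameterLength, parameterUnits and isValid are hypothetical names for this note; the hasThis flag stands for the extra unit taken by “this” in instance and interface methods.

```java
// A minimal sketch of the parameter-length rule described above.
// ParameterLength and its methods are hypothetical helper names.
final class ParameterLength {

    // long and double count as 2 units, everything else as 1;
    // hasThis adds 1 unit for instance and interface methods.
    static int parameterUnits(boolean hasThis, Class<?>... parameterTypes) {
        int units = hasThis ? 1 : 0;
        for (Class<?> p : parameterTypes) {
            units += (p == long.class || p == double.class) ? 2 : 1;
        }
        return units;
    }

    static boolean isValid(boolean hasThis, Class<?>... parameterTypes) {
        return parameterUnits(hasThis, parameterTypes) <= 255;
    }

    public static void main(String[] args) {
        // static Object m(int, double, Thread): 1 + 2 + 1 = 4
        System.out.println(parameterUnits(false, int.class, double.class, Thread.class));
        // the same signature on an instance method: 4 + 1 for "this" = 5
        System.out.println(parameterUnits(true, int.class, double.class, Thread.class));
    }
}
```

One consequence: a static method can declare at most 127 long parameters (254 units), because 128 of them would already need 256 units.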

Examples from .class file

The ability to read descriptors is helpful when reading the output that the javap command produces for a .class file.

Have a look at this snippet:

56: invokestatic  #106                // Method java/lang/Byte.valueOf:(B)Ljava/lang/Byte;
59: aastore
60: invokestatic  #112                // Method java/lang/String.format:(Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/String;
63: invokevirtual #116                // Method java/io/PrintStream.print:(Ljava/lang/String;)V

Here we have three calls, two to static methods (invokestatic) and one to an instance method (invokevirtual):

  • Byte Byte.valueOf(byte b), represented as (B)Ljava/lang/Byte;
  • String String.format(String s, Object... args), represented as (Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/String; (the varargs parameter appears as a plain Object[] array)
  • void PrintStream.print(String s), an instance method of the PrintStream class, represented as (Ljava/lang/String;)V
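
Going the other way, a descriptor copied from javap output can be decoded with MethodType.fromMethodDescriptorString. The sketch below assumes the input is the text that follows "Method " in a javap comment, i.e. owner.name:descriptor; JavapRefs, descriptorPart and decode are names invented for this note. Passing null as the class loader asks the JDK to resolve class names through the system class loader.

```java
import java.lang.invoke.MethodType;

// A minimal sketch: decoding the descriptor part of a javap method reference.
// JavapRefs, descriptorPart and decode are hypothetical helper names.
final class JavapRefs {

    // "java/lang/Byte.valueOf:(B)Ljava/lang/Byte;" -> "(B)Ljava/lang/Byte;"
    static String descriptorPart(String ref) {
        return ref.substring(ref.indexOf(':') + 1);
    }

    // Resolves the classes named in the descriptor (null = system class loader).
    static MethodType decode(String ref) {
        return MethodType.fromMethodDescriptorString(descriptorPart(ref), null);
    }

    public static void main(String[] args) {
        MethodType format = decode(
                "java/lang/String.format:(Ljava/lang/String;[Ljava/lang/Object;)Ljava/lang/String;");
        System.out.println("returns: " + format.returnType().getName());
        for (Class<?> p : format.parameterList()) {
            System.out.println("takes:   " + p.getName());
        }
    }
}
```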

Summary

This was a short post, just a warm-up before jumping into Section 4.4, The Constant Pool, which is a longer read. Together with the three sections that follow (Fields, Methods and Attributes), it gives a complete picture of all the structures used in a .class file.

Then follow the sections on constraints (4.9), verification of class files (4.10) and limitations of the JVM (4.11). After that my knowledge of the .class file structure will be complete 😂


This post is part of the jvm series.