
Virtual Threads in Java 21

A bit of history

Java has allowed concurrent programming since version 1.0 through the use of the Thread class.

This capability was introduced in 1996 (I celebrated my 18th birthday that year!), but it quickly became evident that it wasn’t straightforward. Larger and more complex applications required better tools to handle threads.

The standard Java library lacked essential constructs for modeling certain concurrent programming patterns, and the code was often hard to read and prone to errors.

./java-concurrent.png

Then, eight years later, in 2004, Java SE 5.0 was released, and it brought new classes in java.util.concurrent (concurrency utilities overview). Although these changes were overshadowed by generics, enums, varargs, and annotations, they significantly improved concurrent programming:

  • The Executor framework allowed us to queue tasks (Runnable) in a scheduler, eliminating the need to manually start threads.
  • New, efficient concurrent implementations of the Map, List, and Queue interfaces were added (the Queue and BlockingQueue interfaces themselves were introduced in this version).
  • Thread synchronization tools such as mutexes, semaphores, barriers, and latches were added.
  • Lock implementations in java.util.concurrent.locks improved on synchronization with synchronized blocks, adding features such as timeouts when waiting for a lock, multiple condition variables per lock, and interruptible lock waits.

Ten years later, in 2014, Java 8 was released.

It introduced lambdas, which somewhat overshadowed the concurrency changes. The fork-join framework (available since Java 7) gained a common pool in Java 8. In fork-join, each worker thread has its own queue of tasks, which enables an algorithm called “work-stealing”: an idle thread can “steal” tasks from another thread’s queue. Additionally, Java 8 significantly extended the standard library with the CompletableFuture API, an attempt to create an asynchronous programming model in Java. However, it never became as popular among developers as the async/await model found in many other programming languages.

And now, after another ten years, the Loom project comes on stage, bringing another upgrade: virtual threads, structured concurrency, and scoped values.

Examples and demos of virtual threads (by Jose Paumard) can be found in the demo repository.

What is a thread?

Until now, Java threads were merely thin wrappers around operating system threads (OS threads or kernel threads). These threads (pthreads on POSIX systems), which Java now calls platform threads, are quite resource-intensive:

  • Starting a thread takes about 1 millisecond.
  • Each thread reserves memory for its stack (typically 1 to 2 MB by default, depending on the platform).
  • Switching between threads is a heavyweight operation (context switch) managed by the operating system scheduler (see context switch).

The weakness of the thread-per-request model

Because of these limitations, “commodity hardware” can only sustain a limited number of platform threads. On heavily loaded application servers that follow the “one thread per request” model, reaching hundreds of thousands of concurrent requests could lead to memory exhaustion, causing the server to crash. On the other hand, using thread pools (limiting the number of threads) might limit throughput: incoming connections/requests have to queue up until a pooled thread becomes free.

The JVM world attempted to address this problem with approaches like actor-based programming (Akka in Scala and Java), coroutines in Kotlin, and Vert.x. Even in plain Java we had…

The imperfect CompletableFuture

The asynchronous programming model in Java based on CompletableFuture is not perfect (supplyAsync…, thenApply…, thenApply…):

  • it is very hard to read
  • it is difficult to draw conclusions about the course of control
  • it is not known in which thread the tasks are executed (debugging and logging are very difficult)
  • it is difficult to unit test
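
To make the points above concrete, here is a minimal sketch of such a chain. The class name, the pipeline steps, and the values are made up for illustration; only supplyAsync, thenApply, and join are actual CompletableFuture methods. Note that the thread name printed at the end depends on the runtime, which is exactly the problem described above.

```java
import java.util.concurrent.CompletableFuture;

class CompletableFutureChain {

    // Hypothetical pipeline: produce a number, double it, format it.
    static CompletableFuture<String> pipeline() {
        return CompletableFuture
                // Runs asynchronously, by default usually on ForkJoinPool.commonPool().
                .supplyAsync(() -> 21)
                // Each stage runs on whichever thread completes the previous stage,
                // or on the calling thread if that stage is already complete.
                .thenApply(n -> n * 2)
                .thenApply(n -> "answer=" + n + " on " + Thread.currentThread().getName());
    }

    public static void main(String[] args) {
        // join() blocks until the whole chain has completed.
        System.out.println(pipeline().join());
    }
}
```

Even in this tiny example the control flow is expressed through chained callbacks rather than plain statements, and the executing thread is not obvious from the code.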

So, again after about ten years, virtual threads are finally implemented in core Java:

Virtual threads and platform threads

Platform threads

  • Platform threads map 1-to-1 to kernel threads, which are queued by the operating system scheduler.
  • They have a large stack size and require resources managed by the OS.
  • While they are suitable for many tasks, they are expensive resources.
  • Platform threads have default names and can be daemon or non-daemon threads. The thread in which the main() method runs is the main non-daemon thread. The JVM starts its shutdown sequence once all non-daemon threads have finished.

Virtual threads

  • Virtual threads, on the other hand, are user-mode threads managed by the JVM rather than the operating system scheduler.
  • They are mapped to platform threads in a v:p proportion (v > p): multiple virtual threads can map to a single platform thread.
  • Virtual threads are lightweight threads scheduled by a special fork-join pool controlled by the JVM. A single platform thread can execute many virtual threads over time, but only one at any given moment.
  • When a virtual thread executing on a specific platform thread starts a blocking I/O operation, its “state” (data) may be moved to the heap. Once the kernel completes the I/O operation, the JVM will restore the virtual thread from the heap onto a potentially different platform thread.
  • The currentThread() method returns the virtual thread itself; code running in a virtual thread has no access to information about the platform thread that carries it.
  • Virtual threads don’t have default names, and if we don’t set one, getName() will return an empty string.
  • They have a fixed and unchangeable priority.
  • They are daemon threads and don’t block JVM shutdown.
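
Some of the properties listed above can be checked directly. The following is a minimal sketch, assuming Java 21 or newer; the class and method names are invented for this example.

```java
class VirtualThreadProperties {

    // Starts one virtual thread and records what it reports about itself.
    static String describe() throws InterruptedException {
        StringBuilder sb = new StringBuilder();
        Thread vt = Thread.ofVirtual().start(() -> {
            Thread self = Thread.currentThread();       // the virtual thread, not its carrier
            self.setPriority(Thread.MAX_PRIORITY);      // has no effect on a virtual thread
            sb.append("virtual=").append(self.isVirtual())
              .append(", name='").append(self.getName()).append("'")
              .append(", daemon=").append(self.isDaemon())
              .append(", priority=").append(self.getPriority());
        });
        vt.join();   // after join(), the writes made by the virtual thread are visible here
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        // prints: virtual=true, name='', daemon=true, priority=5
        System.out.println(describe());
    }
}
```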

./vthreads.png

Creating and running threads

To create and run threads, the developer writes code in the traditional way: by creating Runnables and running them in threads to perform long-running operations without blocking the main application thread. Java 21 provides two methods for this purpose: the first one creates platform threads, and the second one creates virtual threads. The object that actually creates the threads is a thread builder (Thread.Builder):

  • Thread.ofPlatform() - creates a builder for platform threads or for a platform thread factory:
  Thread thread = Thread.ofPlatform().daemon().start(runnable);
  Thread thread = Thread.ofPlatform().name("duke").unstarted(runnable);
  ThreadFactory factory = Thread.ofPlatform().daemon().name("worker-", 0).factory();
  • Thread.ofVirtual() - creates a builder for virtual threads or for a virtual thread factory:
  Thread thread = Thread.ofVirtual().start(runnable);
  ThreadFactory factory = Thread.ofVirtual().factory();
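
A small usage sketch of the builders listed above, assuming Java 21 or newer. The class name, the thread names, and the logging task are made up for illustration. The threads are started and joined one after another only to keep the output order deterministic.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ThreadFactory;

class ThreadBuilders {

    static List<String> runAll() throws InterruptedException {
        List<String> log = new CopyOnWriteArrayList<>();
        // Each thread records its own name and kind.
        Runnable task = () -> {
            Thread t = Thread.currentThread();
            log.add(t.getName() + (t.isVirtual() ? " (virtual)" : " (platform)"));
        };

        // A named platform thread, created but not started yet.
        Thread platform = Thread.ofPlatform().name("duke").unstarted(task);

        // A factory that produces virtual threads named worker-0, worker-1, ...
        ThreadFactory factory = Thread.ofVirtual().name("worker-", 0).factory();
        Thread v0 = factory.newThread(task);
        Thread v1 = factory.newThread(task);

        for (Thread t : List.of(platform, v0, v1)) {
            t.start();
            t.join();
        }
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints: [duke (platform), worker-0 (virtual), worker-1 (virtual)]
        System.out.println(runAll());
    }
}
```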

Simple test program

This example is an extension of Jose Paumard’s Loom demo.

MaxThreads.java starts a number of virtual (virt) or platform (plat) threads; the number is passed as a command-line parameter. The program orders each thread to sleep for two seconds, waits until all the threads complete, and prints out the time it took for them to finish.

Here’s how I’d start ten platform threads:

java --source 21 --enable-preview MaxThreads.java plat 10

And here is how I’d start fifteen virtual threads:

java --source 21 --enable-preview MaxThreads.java virt 15

The program:

import java.time.Duration;
import java.time.Instant;
import java.util.Set;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.IntStream;

public class MaxThreads {

    // Run with: java --source 21 --enable-preview MaxThreads.java <virt|plat> <count>

    private static void exitWithMessage() {
          System.out.println("Expected arguments: <virt|plat> <thread count>");
          System.exit(1);
    }
    public static void main(String[] args) throws InterruptedException {
        System.out.println(Arrays.toString(args));
        if (args.length != 2) {
          exitWithMessage();
        }

        var builder = switch(args[0]) {
          case "virt" -> Thread.ofVirtual();
          case "plat" -> Thread.ofPlatform();
          default -> null;
        };

        var count = Integer.parseInt(args[1]);
        
        if (builder == null) {
          exitWithMessage();
        }
        // build (but do not start yet) the requested number of threads; the "platform-" name prefix is used for both kinds
        var threads =
              IntStream.range(0, count)
                    .mapToObj(index ->
                          builder
                                .name("platform-", index)
                                .unstarted(() -> {
                                    try {
                                        Thread.sleep(2_000);
                                    } catch (InterruptedException e) {
                                        throw new RuntimeException(e);
                                    }
                                }))
                    .toList();

        Instant begin = Instant.now();
        threads.forEach(Thread::start);
        for (Thread thread : threads) {
            thread.join();
        }
        Instant end = Instant.now();
        System.out.println("Duration = " + Duration.between(begin, end));
    }
}

Platform threads - running the program

I’ll run the above program in a loop, each time increasing the number of threads by an order of magnitude (i.e. ten times). I start with platform threads and iterate from ten to a million threads.

The JVM on my machine couldn’t handle ten thousand platform threads:

➜  jvm for i in {1..6}; do java --source 21  --enable-preview MaxThreads.java plat $(( 10 ** $i )); done > out.txt 2>&1
➜  jvm cat out.txt
───────┬──────────────────────────
       │ File: out.txt
───────┼──────────────────────────
   1   │ [plat, 10]
   2   │ Duration = PT2.00168193S
   3   │ [plat, 100]
   4   │ Duration = PT2.009612897S
   5   │ [plat, 1000]
   6   │ Duration = PT2.056957326S
   7   │ [plat, 10000]
   8   │ [1,604s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
   9   │ [1,604s][warning][os,thread] Failed to start the native thread for java.lang.Thread "platform-8979"
  10   │ Exception in thread "main" java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
  11   │     at java.base/java.lang.Thread.start0(Native Method)
  12   │     at java.base/java.lang.Thread.start(Thread.java:1526)
  13   │     at java.base/java.lang.Iterable.forEach(Iterable.java:75)
  14   │     at MaxThreads.main(MaxThreads.java:51)
  15   │ [plat, 100000]
  16   │ [2,080s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
  17   │ [2,080s][warning][os,thread] Failed to start the native thread for java.lang.Thread "platform-8979"
  18   │ Exception in thread "main" java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
  19   │     at java.base/java.lang.Thread.start0(Native Method)
  20   │     at java.base/java.lang.Thread.start(Thread.java:1526)
  21   │     at java.base/java.lang.Iterable.forEach(Iterable.java:75)
  22   │     at MaxThreads.main(MaxThreads.java:51)
  23   │ [plat, 1000000]
  24   │ [4,175s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
  25   │ [4,175s][warning][os,thread] Failed to start the native thread for java.lang.Thread "platform-8976"
  26   │ Exception in thread "main" java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
  27   │     at java.base/java.lang.Thread.start0(Native Method)
  28   │     at java.base/java.lang.Thread.start(Thread.java:1526)
  29   │     at java.base/java.lang.Iterable.forEach(Iterable.java:75)
  30   │     at MaxThreads.main(MaxThreads.java:51)
───────┴──────────────────────────
➜  jvm

Virtual threads - running the program

How are virtual threads doing? They’re doing great! :)

➜  jvm for i in {1..6}; do java --source 21  --enable-preview MaxThreads.java virt $(( 10 ** $i )); done > out_virt.txt  2>&1
➜  jvm cat out_virt.txt
───────┬──────────────────────────
       │ File: out_virt.txt
───────┼──────────────────────────
   1   │ [virt, 10]
   2   │ Duration = PT2.005876979S
   3   │ [virt, 100]
   4   │ Duration = PT2.00849789S
   5   │ [virt, 1000]
   6   │ Duration = PT2.025809059S
   7   │ [virt, 10000]
   8   │ Duration = PT2.138172909S
   9   │ [virt, 100000]
  10   │ Duration = PT2.785108472S
  11   │ [virt, 1000000]
  12   │ Duration = PT11.038094874S
───────┴──────────────────────────
➜  jvm

Inside

Until now, a thread that was not “running” could be interrupted (the infamous InterruptedException), could complete normally, or could throw an exception. Now something else may happen: a thread can be “parked”, e.g. while it waits for the operating system to complete some I/O operation, and after a while it will be “unparked”, that is, restored to a state where it can continue its computation with the received data. How is this possible?

If you look deep into the Thread class implementation, in particular into the sleep method, you’ll find a Continuation class, an internal wrapper around the task that a virtual thread runs. The interesting yieldContinuation method is responsible for handing control back to the platform thread. If we enable access to the internal Java module (see the openjdk mailing list thread): java --add-opens java.base/jdk.internal.vm=ALL-UNNAMED <your-main-class> then we can use the Continuation and ContinuationScope classes ourselves in such a way that, while still inside a “Continuation”, we “park” the current task, and the rest of the code isn’t executed. If we run the continuation again, it picks up where it left off and continues execution to the end. This is not a public API (and it is not yet decided whether it ever will be), but it is quite interesting to see what’s possible.

This code shows how to steer thread execution using Continuation: G3_Continuation_Yield.
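
Parking and unparking can also be observed through public API only, without touching the internal Continuation class. The following is a minimal sketch, assuming Java 21 or newer; it uses LockSupport.park and LockSupport.unpark rather than blocking I/O, and the class and method names are invented for this example.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class ParkDemo {

    static String parkAndResume() throws InterruptedException {
        AtomicBoolean released = new AtomicBoolean(false);
        StringBuilder trace = new StringBuilder();

        Thread vt = Thread.ofVirtual().start(() -> {
            trace.append("before park; ");
            // park() may return spuriously, so re-check the flag in a loop.
            while (!released.get()) {
                LockSupport.park();
            }
            trace.append("after unpark");
        });

        // Poll until the virtual thread is reported as parked.
        Thread.State observed;
        do {
            Thread.sleep(1);
            observed = vt.getState();
        } while (observed != Thread.State.WAITING);

        // Let the virtual thread continue where it left off.
        released.set(true);
        LockSupport.unpark(vt);
        vt.join();

        return observed + ": " + trace;
    }

    public static void main(String[] args) throws InterruptedException {
        // prints: WAITING: before park; after unpark
        System.out.println(parkAndResume());
    }
}
```

While the virtual thread is parked, it does not occupy a platform thread; the code after park() runs only once the thread has been unparked.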

Structured concurrency

With virtual threads available, developers can try out structured concurrency (a preview feature in Java 21), which allows for writing imperative code without using CompletableFuture. [StructuredTaskScope](https://download.java.net/java/early_access/jdk21/docs/api/java.base/java/util/concurrent/StructuredTaskScope.html) enables the following features:

  • Creation of multiple virtual threads (fork())
  • Definition of when the scope should be closed (on success, on failure)
  • Clear visibility of dependencies between threads in thread dumps
  • The ability to cancel the parent thread, causing all the child threads to be canceled as well
  • No need to explicitly declare an ExecutorService, which might cause confusion about when to shut it down

For example:

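    Callable<String> task1 = ...
    Callable<Integer> task2 = ...

    try (var scope = new StructuredTaskScope<Object>()) {

        // fork() starts each task in its own virtual thread
        Subtask<String> subtask1 = scope.fork(task1);
        Subtask<Integer> subtask2 = scope.fork(task2);

        // wait for both subtasks to finish (may throw InterruptedException)
        scope.join();

        // ... process results/exceptions, e.g. via subtask1.get() or subtask2.exception() ...

    } // close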

Scoped value

Within a StructuredTaskScope, the ScopedValue mechanism (also a preview feature in Java 21) is the intended alternative to the traditional ThreadLocal.

Like ThreadLocal, ScopedValue allows sharing a value without passing it as a parameter, and threads forked inside a StructuredTaskScope inherit it. However, a ScopedValue is bound to and visible only within a dynamically scoped region (“scope”), and it’s not accessible outside of it. It can be rebound to a different value for a nested scope, and child threads forked there will see the new value.

Sample usage

    private static final ScopedValue<String> NAME = ScopedValue.newInstance();

    ScopedValue.runWhere(NAME, "duke", () -> {
        try (var scope = new StructuredTaskScope<String>()) {

            scope.fork(() -> childTask1());
            scope.fork(() -> childTask2());
            scope.fork(() -> childTask3());

            ...
        }
    });

You can read about ScopedValues in the blog post about ScopedValues and ThreadLocals.

Preparations in the ecosystem

Interestingly, even Spring Boot has prepared itself to run request handling on virtual threads using its embedded Tomcat. You can expose as a @Bean a TomcatProtocolHandlerCustomizer that sets the executor used for handling synchronous requests (return handler -> handler.setExecutor(Executors.newVirtualThreadPerTaskExecutor())).
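
The executor mentioned above can also be tried outside Spring. The following is a minimal plain-Java sketch, assuming Java 21 or newer; the class name, the number of tasks, and the task body (a short sleep standing in for request handling) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerTaskExecutorDemo {

    // Submits the given number of tasks and counts how many ran on a virtual thread.
    static int handleRequests(int requests) throws Exception {
        List<Future<Boolean>> results = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                results.add(executor.submit(() -> {
                    Thread.sleep(10);                          // stands in for blocking work
                    return Thread.currentThread().isVirtual();
                }));
            }
        } // close() waits until all submitted tasks have finished

        int virtualCount = 0;
        for (Future<Boolean> result : results) {
            if (result.get()) {
                virtualCount++;
            }
        }
        return virtualCount;
    }

    public static void main(String[] args) throws Exception {
        // prints: 1000
        System.out.println(handleRequests(1_000));
    }
}
```

Every submitted task gets a new virtual thread, so there is no pool size to tune.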

For more details I highly recommend the video with Jose Paumard (of JavaCafe), “Virtual Threads and Structured Concurrency in Java 21 With Loom”, which inspired me to play with his demo code and to write this blog post.

The evening is coming, so it’s time to take care of the house and kids.

Happy coding!