I was going through some of the samples and some questions in the forum regarding Sync and Async.
A few thoughts: if a workflow calls an activity using an Async function, it returns a Promise. When I call Promise#get, it waits for the operation to complete. Maybe my understanding is off, but if the worker receives a huge number of work requests, won’t the blocking in Promise#get be a problem? How does it manage threads? Does it have an event-loop thread concept like Netty?
TL;DR: Blocking a workflow thread doesn’t have to consume a thread in the worker process. So you can have hundreds of millions of blocked workflows and process them with workers that have hundreds of threads in their thread pools.
A workflow executes as a series of workflow tasks. While workflow code executes, the external requests it makes (like scheduling activities, child workflows, and timers) are not sent directly to the service but are accumulated at the workflow worker as commands. Only when all of the workflow’s threads are blocked is the workflow task completed, which is done by sending all the accumulated commands to the service.
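To make the command accumulation concrete, here is a minimal sketch in plain Java. It is a toy model, not SDK code: the class CommandBufferSketch and its methods are hypothetical names invented for illustration, and the "service" is just a list.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names come from the Temporal SDK.
final class CommandBufferSketch {
    // Commands requested by workflow code during the current workflow task.
    private final List<String> pending = new ArrayList<>();
    // What the pretend service has received so far.
    private final List<String> sentToService = new ArrayList<>();

    // Called by workflow code; the request is only buffered, not sent.
    void scheduleActivity(String name) {
        pending.add("ScheduleActivity:" + name);
    }

    void startTimer(String id) {
        pending.add("StartTimer:" + id);
    }

    // Called once every workflow thread is blocked: send the whole batch.
    int completeWorkflowTask() {
        int flushed = pending.size();
        sentToService.addAll(pending);
        pending.clear();
        return flushed;
    }

    List<String> sentToService() {
        return sentToService;
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        CommandBufferSketch task = new CommandBufferSketch();
        task.scheduleActivity("chargeCard");
        task.startTimer("reminder");
        System.out.println(task.sentToService().size()); // 0, nothing sent yet
        System.out.println(task.completeWorkflowTask()); // 2
        System.out.println(task.sentToService());
    }
}
```

The point of the sketch is only the ordering: requests made by workflow code are buffered, and nothing reaches the service until the task is completed.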
After the workflow task completes, the full state of the workflow, including all the threads it is using, can be released back to the process.
Later, when some event (like an activity completion or a timer firing) happens, the service schedules a new workflow task. A workflow worker (possibly a different one than the original) picks it up from the workflow task queue, restores the workflow state to where it was when the previous workflow task completed, and delivers the new event to the workflow. The new event unblocks some of the workflow threads, causing new requests, which in turn become commands, and the whole process repeats.
Note that a workflow consumes threads only while it is making progress, and each such workflow task is usually very short, on the order of milliseconds. This allows a practically unlimited number of blocked workflows to be served by a limited number of workflow worker threads.
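The task-by-task cycle described above can be sketched in plain Java. This is a deliberately simplified model with hypothetical names (ReplaySketch, Blocked, runTask): the real SDK blocks an actual thread, whereas this toy "blocks" by unwinding the stack with an exception, and its history is just a list of recorded results.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these names are hypothetical and are not Temporal SDK APIs.
final class ReplaySketch {
    // Signals that the workflow needs a result that is not in the history yet.
    static final class Blocked extends RuntimeException {
        final String command;

        Blocked(String command) {
            super(command);
            this.command = command;
        }
    }

    // Returns a recorded result, or "blocks" by unwinding with the command to send.
    static int await(List<Integer> history, int index, String command) {
        if (index < history.size()) {
            return history.get(index);
        }
        throw new Blocked(command);
    }

    // Hypothetical workflow: two sequential activities, then a sum.
    static int workflow(List<Integer> history) {
        int a = await(history, 0, "activityA");
        int b = await(history, 1, "activityB");
        return a + b;
    }

    // One workflow task: re-run the code from the start against the history.
    static String runTask(List<Integer> history) {
        try {
            return "completed:" + workflow(history);
        } catch (Blocked blocked) {
            return "command:" + blocked.command;
        }
    }

    public static void main(String[] args) {
        List<Integer> history = new ArrayList<>();
        System.out.println(runTask(history)); // command:activityA
        history.add(2);                        // activityA completed with 2
        System.out.println(runTask(history)); // command:activityB
        history.add(3);                        // activityB completed with 3
        System.out.println(runTask(history)); // completed:5
    }
}
```

Between calls to runTask nothing is held for the workflow except its history, which is why any worker can pick up the next task.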
Recreating the state of a workflow from scratch on every workflow task is pretty resource-intensive. So, as an optimization, workflows are cached at workflow workers. A cached workflow still holds all the threads its code is blocked on. That’s why, if you take a thread dump of a workflow worker, you will see a bunch of blocked workflow threads.
Every time a workflow that is executing a workflow task needs to create a thread, it gets one from an internal thread pool. If the pool doesn’t have any free threads, some workflow is kicked out of the cache and its resources are released. The released threads are returned to the thread pool, and the new thread creation succeeds.
One consequence of this design is that the number of cached workflows in Java is limited not by memory but by the number of threads the process can sustain.
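Here is a rough sketch of a cache bounded by a thread budget rather than by memory. Everything in it is an assumption made for illustration: the class ThreadBudgetCacheSketch is hypothetical, threads are modeled as plain counters, and evicting the least recently used workflow is just one possible policy, not necessarily the one the SDK uses.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: a workflow cache limited by a thread budget.
final class ThreadBudgetCacheSketch {
    private final int maxThreads;
    private int threadsInUse = 0;
    // workflowId -> threads held; access order keeps the least recently used first.
    private final LinkedHashMap<String, Integer> cache =
            new LinkedHashMap<>(16, 0.75f, true);

    ThreadBudgetCacheSketch(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    // A workflow running its task asks for one more thread.
    void acquireThread(String workflowId) {
        while (threadsInUse >= maxThreads) {
            if (!evictOneOtherThan(workflowId)) {
                throw new IllegalStateException("no thread available");
            }
        }
        cache.merge(workflowId, 1, Integer::sum);
        threadsInUse++;
    }

    // Evicts the least recently used workflow that is not the requester.
    private boolean evictOneOtherThan(String workflowId) {
        Iterator<Map.Entry<String, Integer>> it = cache.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (!entry.getKey().equals(workflowId)) {
                threadsInUse -= entry.getValue(); // its threads go back to the pool
                it.remove();
                return true;
            }
        }
        return false;
    }

    boolean isCached(String workflowId) {
        return cache.containsKey(workflowId);
    }

    int threadsInUse() {
        return threadsInUse;
    }

    public static void main(String[] args) {
        ThreadBudgetCacheSketch pool = new ThreadBudgetCacheSketch(3);
        pool.acquireThread("wf-1");
        pool.acquireThread("wf-1");
        pool.acquireThread("wf-2");
        pool.acquireThread("wf-3"); // pool is full, so wf-1 is evicted
        System.out.println(pool.isCached("wf-1")); // false
        System.out.println(pool.threadsInUse());   // 2
    }
}
```

An evicted workflow is not lost in this model. It simply has to be rebuilt from its history the next time a workflow task arrives for it.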
Thanks for the detailed answer
I was re-reading the reply again out of curiosity. So the Temporal SDK manages threads like Akka actors or Loom fibers? Rather than us writing async code and making it complex, the framework handles threads in a different way.
It does manage threads. Otherwise it wouldn’t be possible to execute a multithreaded application deterministically. It is not as sophisticated as Loom, as it is implemented as a library that runs on any JVM.