What are the recommended values for workflowLocal/workflow/activity poll thread counts?

Hello,

In the Java SDK there are the following settings:

  1. WorkerFactoryOptions.setWorkflowHostLocalPollThreadCount()
  2. WorkerOptions.setWorkflowPollThreadCount()
  3. WorkerOptions.setActivityPollThreadCount()

In my test setup I created 8 workers and set these to 4, 4, and 8 respectively. That was enough to overwhelm the gRPC client on my local machine and make calls such as “WorkflowClient.start” non-functional when made from the same process. Note: in this setup the gRPC ManagedChannel was reused across all of these threads.

Could you please provide guidance on what these settings really mean, and in which scenarios, if any, settings 1, 2, or 3 should be set above “one”?


Thanks,
Artem


These settings define how many threads poll for tasks from the corresponding task queues. The reason to increase them is a backlog accumulating in a task queue while the worker is not fully utilized.

It is not that hard to overwhelm a local Docker-based service, as it is not provisioned for any performance testing. Try it with a real deployment to see if you can fully load your workers.


Thanks, I did more digging. Apparently, what happened in my local test is that the gRPC ManagedChannel shared across all stubs was overwhelmed. Once I crossed 80-100 threads, connectivity died. When I created 2 factories with 80 threads each, it worked fine. I’m not sure what’s causing this: either the grpc-java ManagedChannel implementation is slow, or the Temporal framework’s exclusive use of blocking stubs creates more contention on the ManagedChannel than async future stubs would.
Note: this happened with no load on the system, so it was caused purely by the volume of polling and the number of threads per gRPC ManagedChannel.
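
For reference, the thread count in the original setup can be estimated with a few lines of plain Java. This is only a back-of-the-envelope sketch: it assumes the workflow and activity poll thread counts apply per worker and the host-local poll thread count applies once per factory, and it ignores any other threads the SDK or gRPC may create. The class and method names are hypothetical and not part of the Temporal SDK.

```java
public final class PollerThreadEstimate {

    // Rough estimate of poller threads sharing one factory (and so one channel).
    // Assumption: workflow/activity pollers are per worker, host-local pollers
    // are per factory. Other SDK and gRPC threads are not counted.
    static int totalPollerThreads(int workers,
                                  int workflowPollThreads,
                                  int activityPollThreads,
                                  int hostLocalPollThreads) {
        return workers * (workflowPollThreads + activityPollThreads) + hostLocalPollThreads;
    }

    public static void main(String[] args) {
        // Values from the setup described earlier: 8 workers, settings 4, 4, 8.
        int total = totalPollerThreads(8, 4, 8, 4);
        System.out.println("Estimated poller threads: " + total); // prints 100
    }
}
```

Under those assumptions the setup lands at about 100 long-polling threads on a single channel, which is in the same range where connectivity started to fail.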


I’ve noticed something similar. Once the number of pollers on a single factory increases, it seems to kill performance in general.

Every poller occupies and blocks a separate thread. These threads also context-switch fairly frequently, because they get unblocked at a fairly uniform rate. This may affect the system when the number of pollers is large. There is also a relevant issue to rework all pollers to use just one thread and futureStub instead of blockingStub: Pollers should use just one thread to perform async gRPC requests · Issue #1456 · temporalio/sdk-java · GitHub
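
To illustrate the idea behind that issue, here is a minimal sketch using only the JDK. It is not the SDK's implementation and does not use gRPC: the long poll is simulated with a scheduled task, and all names (`SingleThreadPollSketch`, `pollAsync`, `runPolls`) are hypothetical. It shows many polls in flight at once while a single thread handles every response, instead of one blocked thread per poller.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class SingleThreadPollSketch {

    // Simulated non-blocking long poll: the future completes after delayMillis
    // without parking a dedicated thread while it waits.
    static CompletableFuture<String> pollAsync(ScheduledExecutorService io,
                                               int pollerId,
                                               long delayMillis) {
        CompletableFuture<String> response = new CompletableFuture<>();
        io.schedule(() -> { response.complete("task-" + pollerId); },
                delayMillis, TimeUnit.MILLISECONDS);
        return response;
    }

    // Starts `pollers` polls at once and returns, per thread name, how many
    // responses that thread handled.
    static Map<String, Integer> runPolls(int pollers, long delayMillis) {
        ScheduledExecutorService io = Executors.newSingleThreadScheduledExecutor();
        Map<String, Integer> handledByThread = new ConcurrentHashMap<>();
        try {
            CompletableFuture<?>[] inFlight = new CompletableFuture<?>[pollers];
            for (int i = 0; i < pollers; i++) {
                // The callback is explicitly dispatched to the single io thread.
                inFlight[i] = pollAsync(io, i, delayMillis).thenAcceptAsync(
                        task -> handledByThread.merge(
                                Thread.currentThread().getName(), 1, Integer::sum),
                        io);
            }
            // Wait until every simulated poll has been answered and handled.
            CompletableFuture.allOf(inFlight).join();
        } finally {
            io.shutdown();
        }
        return handledByThread;
    }

    public static void main(String[] args) {
        Map<String, Integer> handled = runPolls(50, 100);
        System.out.println("Responses handled per thread: " + handled);
    }
}
```

With blocking stubs, the same 50 outstanding polls would need 50 parked threads. Here the waiting costs no threads, and the single executor thread only wakes up to process completions.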

kills performance in general

It’s not something actionable on our side, but if you do some performance profiling and find something specific, it may be useful.

Regarding gRPC ManagedChannel performance: it’s abstracted from us by the gRPC implementation, but we may take a look at it at some point.