What are recommended values for workflowLocal/workflow/activity poll thread counts?

artem · August 6, 2020, 2:38am

Hello,

In the Java SDK there are the following settings:

WorkerFactoryOptions.setWorkflowHostLocalPollThreadCount()
WorkerOptions.setWorkflowPollThreadCount()
WorkerOptions.setActivityPollThreadCount()

In my test setup I created 8 workers and set these to 4, 4, and 8 respectively. That was enough to overwhelm the local machine’s gRPC client and make calls such as “WorkflowClient.start” non-functional when running in the same process. Note: in this setup the gRPC ManagedChannel was reused across all of these threads.

Could you please provide guidance on what these settings really mean, and in which scenarios, if any, one of the three settings above should be set higher than one?

–
Thanks,
Artem

maxim · August 6, 2020, 2:56pm

These settings define how many threads poll for tasks from the corresponding task queues. The reason to increase them is a backlog accumulating in a task queue while the worker is not fully utilized.

It is not that hard to overwhelm a local Docker-based service, as it is not provisioned for any performance testing. Try it with a real deployment to see if you can fully load your workers.

artem · August 6, 2020, 5:26pm

Thanks, I did more digging. Apparently, what happened in my local test is that the gRPC ManagedChannel shared across all stubs was overwhelmed. Once I crossed 80-100 threads, connectivity died. When I created 2 factories with 80 threads each, it worked fine. I’m not sure what’s causing this: either the grpc-java ManagedChannel implementation is slow, or the Temporal framework’s exclusive use of blocking stubs creates more contention on the ManagedChannel than async future stubs would.
Note: this happened with no load on the system, so it was caused purely by the volume of polling and the number of threads per gRPC ManagedChannel.
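
For reference, a rough way to estimate how many poller threads the setup from the first post creates. This is only a sketch: it assumes the host-local count applies once per factory (it lives on WorkerFactoryOptions) and the other two apply once per worker (they live on WorkerOptions), and it treats the result as an upper bound. The class and method names are hypothetical and not part of the SDK.

```java
// Hypothetical helper, not part of the Temporal SDK.
class PollThreadEstimate {

    // Assumption: workflow and activity poll threads are created per worker,
    // host-local workflow poll threads are created once per factory.
    static int total(int workers,
                     int workflowPollThreads,
                     int activityPollThreads,
                     int hostLocalPollThreads) {
        return workers * (workflowPollThreads + activityPollThreads) + hostLocalPollThreads;
    }

    public static void main(String[] args) {
        // 8 workers with host-local = 4, workflow = 4, activity = 8
        System.out.println(total(8, 4, 8, 4)); // prints 100
    }
}
```

Under these assumptions the original configuration lands at about 100 poller threads on one channel, which is in the range where connectivity was reported to break down.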

Hrishikesh_Chappadi · March 22, 2023, 4:59am

I’ve noticed something similar. Once the number of pollers on a single factory increases, it looks like it kills performance in general.

spikhalskiy · March 22, 2023, 3:09pm

Every poller occupies and blocks a separate thread. These threads also go through fairly frequent context switches, because they get unblocked fairly uniformly. With a larger number of pollers, this may affect the system. There is also a relevant task to rework all pollers to use just one thread and futureStub instead of blockingStub: “Pollers should use just one thread to perform async gRPC requests” (Issue #1456, temporalio/sdk-java on GitHub).
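
To illustrate the idea behind that issue, here is a minimal sketch using only the JDK. It is not the SDK’s implementation and does not use gRPC: fakeLongPoll is a hypothetical stand-in for an async long-poll call, and a single scheduler thread completes all of the in-flight requests instead of one blocked thread per poller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not Temporal or gRPC code.
class AsyncPollSketch {

    // Stand-in for an async long poll: the future completes after delayMs,
    // and no thread is blocked while the "request" is pending.
    static CompletableFuture<String> fakeLongPoll(ScheduledExecutorService timer,
                                                  int id, long delayMs) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> result.complete("task-" + id), delayMs, TimeUnit.MILLISECONDS);
        return result;
    }

    // Keeps `polls` requests in flight at once, served by one timer thread.
    static List<String> pollAll(int polls, long delayMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<String>> inFlight = new ArrayList<>();
            for (int i = 0; i < polls; i++) {
                inFlight.add(fakeLongPoll(timer, i, delayMs));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> f : inFlight) {
                results.add(f.join()); // only the caller waits here
            }
            return results;
        } finally {
            timer.shutdown();
        }
    }
}
```

With blocking stubs the equivalent would need one thread per concurrent poll; here 100 concurrent polls, for example via pollAll(100, 50), need only the timer thread plus the caller.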

Regarding “kills performance in general”: that’s not something actionable on our side, but if you do some performance profiling and find something specific, it may be useful.

Speaking of gRPC ManagedChannel performance: it’s abstracted from us by the gRPC implementation, but we may take a look at it at some point.
