Tuning Temporal setup for better performance

Our team is currently evaluating whether Temporal is suitable for our needs. I’ve performed some simple load testing on our test environment.
We’re using Temporal version 0.28, deployed from the Helm chart to GKE on n1-standard hosts.

To simulate load, I use a slightly modified HelloActivity from the java-samples repository. I push workflow executions at different rates and monitor how much Temporal can handle without an increase in latency.

With the default setup from the Helm chart, I got a fairly stable 20 workflow executions per second. After adding extra memory to the Cassandra nodes and scaling up the Temporal services, I managed to reach 50 executions per second, but no more than that. Scaling Cassandra to 6 nodes had no visible effect.

Can you please give some advice on how to improve performance further? Which Cassandra setup should I use? How can I look for possible bottlenecks? Is it possible to tune the Temporal services somehow?

Thanks!

The first expected bottleneck is single task queue throughput. Make sure the task queues used by your example have enough partitions, as they are not autoscaled yet. Partition counts are configured through the dynamic config.
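For reference, a dynamic config fragment along these lines raises the partition counts for one queue. The exact key names should be checked against the dynamic config reference for your Temporal version, and the namespace and task queue name here are made-up examples:

```yaml
# Hypothetical dynamic config fragment: 8 read and 8 write partitions
# for a single task queue. Read and write counts should normally match.
matching.numTaskqueueReadPartitions:
  - constraints:
      namespace: "default"
      taskQueueName: "HelloActivityTaskQueue"
    value: 8
matching.numTaskqueueWritePartitions:
  - constraints:
      namespace: "default"
      taskQueueName: "HelloActivityTaskQueue"
    value: 8
```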

Also, the Java client might need the number of polling threads adjusted to increase throughput. For the workflow task list, adjust WorkerFactoryOptions.workflowHostLocalPollThreadCount; for the activity task list, adjust WorkerOptions.activityPollThreadCount.
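As a sketch of where those two settings live (method names as of the SDK version discussed in this thread; later SDK releases renamed some of these, and the task queue name is a placeholder):

```java
import io.temporal.client.WorkflowClient;
import io.temporal.serviceclient.WorkflowServiceStubs;
import io.temporal.worker.Worker;
import io.temporal.worker.WorkerFactory;
import io.temporal.worker.WorkerFactoryOptions;
import io.temporal.worker.WorkerOptions;

public class WorkerSetup {
  public static void main(String[] args) {
    WorkflowServiceStubs service = WorkflowServiceStubs.newInstance();
    WorkflowClient client = WorkflowClient.newInstance(service);

    // Workflow task polling is configured on the factory.
    WorkerFactoryOptions factoryOptions = WorkerFactoryOptions.newBuilder()
        .setWorkflowHostLocalPollThreadCount(10)
        .build();
    WorkerFactory factory = WorkerFactory.newInstance(client, factoryOptions);

    // Activity task polling is configured per worker.
    WorkerOptions workerOptions = WorkerOptions.newBuilder()
        .setActivityPollThreadCount(10)
        .build();
    Worker worker = factory.newWorker("HelloActivityTaskQueue", workerOptions);
    // ... register workflow and activity implementations here, then:
    factory.start();
  }
}
```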


@maxim
Q1: For Go clients, do we need to adjust the poller count?
I can see the following poller configurations:

MaxConcurrentActivityTaskPollers
and
MaxConcurrentDecisionTaskPollers

The default value for each of these is 2.

Q2: What would you recommend for each of these at 50, 100, 200, 300, 400, 500 tps?

Q3: Any drawbacks of keeping this number very high?

Q4: What should the task list partition count be? Somewhere I read you recommending 15 partitions for 1k tps.

Q5: Any drawbacks of keeping task list partition count high?

Q6: Any other factor we may need to tweak for perf using Go client?

The desired poller count depends on the latency from the worker to the service. The higher the latency, the lower the throughput of a single poller thread. If the number of poller processes is small, you can try increasing the poller count to 5-10 pollers.
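In Go terms, that adjustment is a one-line change per option. This is a sketch against the SDK version this thread discusses (where both defaults are 2); field names differ in newer Go SDKs, and the task queue name is a placeholder:

```go
package main

import (
	"go.temporal.io/temporal/client"
	"go.temporal.io/temporal/worker"
)

func newWorker(c client.Client) worker.Worker {
	// Raise the poller counts from the default of 2 when worker-to-service
	// latency is high or schedule_to_start latencies start to grow.
	options := worker.Options{
		MaxConcurrentActivityTaskPollers: 10,
		MaxConcurrentDecisionTaskPollers: 10,
	}
	return worker.New(c, "HelloActivityTaskQueue", options)
}
```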

Q3: Any drawbacks of keeping this number very high?

A high number will not increase performance, but it will put additional load on the service, especially if the number of worker processes is high.

Q4: What should the task list partition count be? Somewhere I read you recommending 15 partitions for 1k tps.

I would allocate one partition per 50-80 tasks per second, depending on the DB. So 15-20 partitions for 1k tasks per second sounds reasonable.

Q5: Any drawbacks of keeping task list partition count high?

It can increase memory and CPU utilization of matching hosts.

Q6: Any other factor we may need to tweak for perf using Go client?

Make sure that other worker options do not limit the worker throughput explicitly.

Main parameters to tune

Defining polling of tasks from the server:

WorkerOptions#workflowPollThreadCount

WorkerOptions#activityPollThreadCount

WorkerOptions#maxConcurrentWorkflowTaskExecutionSize

WorkerOptions#maxConcurrentActivityExecutionSize

Defining the in-memory cached workflow state and threads:

WorkerFactoryOptions#maxWorkflowThreadCount

WorkerFactoryOptions#workflowCacheSize

Some reasonable constraints on these values

  1. WorkerFactoryOptions#workflowCacheSize ≤ WorkerFactoryOptions#maxWorkflowThreadCount. Having a cache larger than the size of the thread pool doesn’t make much sense.
  2. WorkerOptions#maxConcurrentWorkflowTaskExecutionSize ≤ WorkerFactoryOptions#maxWorkflowThreadCount. maxWorkflowThreadCount should ideally be at least 2x maxConcurrentWorkflowTaskExecutionSize to be safe, but it depends on how actively the app uses threads and how active the workflows are. Having maxConcurrentWorkflowTaskExecutionSize > maxWorkflowThreadCount doesn’t make sense at all and is a misconfiguration, because for each workflow only one Workflow Task can be processed at a single moment in time anyway.
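Putting the two constraints together, a configuration that respects them might look like the following. The specific numbers are arbitrary illustrations, not recommendations:

```java
import io.temporal.worker.WorkerFactoryOptions;
import io.temporal.worker.WorkerOptions;

public class TuningSketch {
  // Hypothetical sizes chosen so that both constraints hold:
  //   workflowCacheSize (600) <= maxWorkflowThreadCount (800)
  //   maxConcurrentWorkflowTaskExecutionSize (400) <= maxWorkflowThreadCount / 2
  static final WorkerFactoryOptions FACTORY_OPTIONS = WorkerFactoryOptions.newBuilder()
      .setMaxWorkflowThreadCount(800)
      .setWorkflowCacheSize(600)
      .build();

  static final WorkerOptions WORKER_OPTIONS = WorkerOptions.newBuilder()
      .setMaxConcurrentWorkflowTaskExecutionSize(400)
      .setMaxConcurrentActivityExecutionSize(200)
      .build();
}
```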

The desired poller count

depends on

  1. The latency from the worker to the service. The higher the latency, the lower the throughput of a single poller thread.
  2. Whether worker threads are getting enough work. If worker threads are not loaded with enough work while schedule_to_start latencies are high, you can try increasing the poller count to 5-10 pollers.
  3. How big the workflow tasks are (how long the average interval is between consecutive points where workflow execution blocks on activities or on waiting/sleeping). The smaller your application’s workflow tasks, the larger the pollers/executors ratio you need.

The desired executors count

depends on the resource utilization of your worker. If the worker is underutilized, increase maxConcurrent*ExecutionSize.

Workflow tasks

It doesn’t make much sense to set WorkerOptions#maxConcurrentWorkflowTaskExecutionSize too high. Because workflow code shouldn’t contain blocking operations or waits [other than the ones provided by the Workflow class], a Workflow Task should occupy a full core during execution. This means it doesn’t make much sense to set WorkerOptions#maxConcurrentWorkflowTaskExecutionSize much higher than the number of cores.

Activities

WorkerOptions#maxConcurrentActivityExecutionSize should be set by looking at the profile of your Activities. If the Activities are mostly computational, it doesn’t make much sense to set it larger than the number of available cores. But if the Activities mostly perform input-output, awaiting RPC calls, it makes sense to increase this number by a lot.
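The two rules of thumb above can be reduced to a small heuristic. This is plain illustrative Java, not SDK API, and the 50x multiplier for IO-bound activities is an assumption to make the idea concrete, not a recommendation:

```java
public class SlotSizing {
    // Workflow tasks are CPU-bound, so cap execution slots near the core count.
    static int workflowTaskSlots(int cores) {
        return cores;
    }

    // Activities that mostly await RPC calls can run far more slots than cores;
    // the 50x multiplier here is a hypothetical starting point.
    static int activitySlots(int cores, boolean ioBound) {
        return ioBound ? cores * 50 : cores;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("workflow task slots: " + workflowTaskSlots(cores));
        System.out.println("CPU-bound activity slots: " + activitySlots(cores, false));
        System.out.println("IO-bound activity slots: " + activitySlots(cores, true));
    }
}
```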

Drawbacks of putting “very large values”

As with any multithreaded system, specifying values that are too large leads to too many active threads doing work, with constant resource stealing and context switching, which decreases the total throughput and increases the latency jitter of the system.


@spikhalskiy

Thanks, the info is really helpful. Based on your guidance, the execution sizes seem to be based on the number of cores, with the poller count adjusted based on the schedule_to_start metrics.

Questions:

  1. Are any of the metrics useful in tuning these numbers? Is it strictly the workflow and activity execution latency? Do other metrics like task queue poll, sticky_cache_miss, sticky_cache_forced_eviction, etc. give any insights?

  2. In my use case, the activities mainly do waiting RPC calls. I guess if I see temporal_activity_schedule_to_start_latency high, then I should probably increase WorkerOptions#maxConcurrentActivityExecutionSize.

  3. For maxConcurrentWorkflowTaskExecutionSize, how would you profile that? It sounds like it should never be bigger than the number of cores. If you see high temporal_workflow_task_schedule_to_start_latency, would that imply that the resources need to be scaled up?

  4. For the activity and workflow pollers, I noticed that the default values are pretty low: 2 workflow / 5 activity pollers. Does this mean that for each task queue there will be 2 workflow / 5 activity polls, which altogether is 4 workflow polls and 10 activity polls? In this case there are almost 2x as many activity pollers as workflow pollers. Is the activity count reflective of how many activities a workflow can execute?

  5. Also, is there any guidance on #setWorkflowHostLocalPollThreadCount(int)? The code states that you should increment this before incrementing the worker poll count.

  6. When configuring autoscaling, would you recommend profiling with one worker and load testing it to determine the ideal numbers? Then effectively you would just scale up based on the schedule_to_start P90 for the workflow metrics?

Thanks,
Derek
