Throughput not improving

During stress testing I am unable to find any obvious bottleneck. CPU usage on the workers and the Temporal servers is less than 50%, and on the Cassandra servers it is less than 30%. I have tried changing the different configurations I am aware of, but the throughput stays the same. I tried the following configurations:

numHistoryShards: 512 and 1024
matching.numTaskqueueReadPartitions: 16

WorkerFactoryOptions.setMaxWorkflowThreadCount: 600, 1200
WorkerFactoryOptions.setWorkflowCacheSize: 600, 1200
WorkerFactoryOptions.setWorkflowHostLocalPollThreadCount: default, 20, 40

WorkerOptions.setMaxConcurrentActivityExecutionSize: 200, 400
WorkerOptions.setMaxConcurrentWorkflowTaskExecutionSize: 200, 400
WorkerOptions.setMaxConcurrentLocalActivityExecutionSize: 200, 400
WorkerOptions.setMaxConcurrentActivityTaskPollers: default, 20, 40
WorkerOptions.setMaxConcurrentWorkflowTaskPollers: default, 20, 40
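For context, this is roughly how the workers are set up with the Java SDK (the frontend address and task queue name below are placeholders, and the values shown are one of the combinations I tried):

import io.temporal.client.WorkflowClient;
import io.temporal.serviceclient.WorkflowServiceStubs;
import io.temporal.serviceclient.WorkflowServiceStubsOptions;
import io.temporal.worker.Worker;
import io.temporal.worker.WorkerFactory;
import io.temporal.worker.WorkerFactoryOptions;
import io.temporal.worker.WorkerOptions;

public class WorkerSetup {
  public static void main(String[] args) {
    // Connect to the Temporal frontend (placeholder address).
    WorkflowServiceStubs service = WorkflowServiceStubs.newServiceStubs(
        WorkflowServiceStubsOptions.newBuilder()
            .setTarget("temporal-frontend:7233")
            .build());
    WorkflowClient client = WorkflowClient.newInstance(service);

    // Factory-level tuning: workflow thread count, cache size, host-local poll threads.
    WorkerFactoryOptions factoryOptions = WorkerFactoryOptions.newBuilder()
        .setMaxWorkflowThreadCount(1200)
        .setWorkflowCacheSize(1200)
        .setWorkflowHostLocalPollThreadCount(40)
        .build();
    WorkerFactory factory = WorkerFactory.newInstance(client, factoryOptions);

    // Worker-level tuning: execution slot sizes and poller counts.
    WorkerOptions workerOptions = WorkerOptions.newBuilder()
        .setMaxConcurrentActivityExecutionSize(400)
        .setMaxConcurrentWorkflowTaskExecutionSize(400)
        .setMaxConcurrentLocalActivityExecutionSize(400)
        .setMaxConcurrentActivityTaskPollers(40)
        .setMaxConcurrentWorkflowTaskPollers(40)
        .build();
    Worker worker = factory.newWorker("stress-test-queue", workerOptions); // placeholder queue

    // worker.registerWorkflowImplementationTypes(...);
    // worker.registerActivitiesImplementations(...);

    factory.start();
  }
}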

Please let me know if I am missing anything.

Also, is there any command I can execute to see the numHistoryShards in the cluster?

I saw an improvement in throughput when I switched to 2 Frontend instances, each with 2 cores and 4 GB RAM. Previously I was using 1 instance with 4 cores and 8 GB RAM. I made the same change for Matching as well. I kept the History servers unchanged: 2 instances, each with 4 cores and 8 GB RAM.

I am assuming the Frontend was the bottleneck, since none of the other changes I have tried so far made any difference.

  • Do we get better throughput with a larger number of smaller Frontend servers? For example, should I use 4 Frontend instances, each with 1 core and 2 GB RAM?
  • How do we detect such bottlenecks, given that CPU usage on the Frontend was less than 50%?
  • Should I try similar changes for the History and Matching servers to achieve better throughput with the same hardware?

@maxim @tihomir

Sorry for the late reply. Can you update this post with the latest info you have so we can take a look?

matching.numTaskqueueReadPartitions: 16

Do you see any improvement if you lower this to 8?

Also, is there any command I can execute to see the numHistoryShards in the cluster?

Yes, you can use tctl (the command below is shorthand for tctl admin cluster describe), for example:

tctl adm cl d | jq .historyShardCount

Also, can you check your Frontend service metrics and look for any service errors:

sum(rate(service_error_with_type{service_type="frontend"}[5m])) by (error_type)