Temporal performance with golang microservice, Cassandra & Elasticsearch

tihomir · May 25, 2022, 12:03am

Is there any more configuration that can be configured to improve the throughput?

One thing you can check is your persistence latencies. Try this Prometheus query:

histogram_quantile(0.95, sum(rate(persistence_latency_bucket{}[1m])) by (operation, le))
for the CreateWorkflowExecution, UpdateWorkflowExecution, and UpdateShard operations. You will have to establish your own baseline for these latencies, but they are a good thing to look at.
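If you want to narrow the query to just those three operations, a small Go helper can build it for you. This is only a sketch: the `latencyQuery` function is a made-up name, and it assumes the `operation` label used in the `by (...)` clause above can also be used as a regex matcher.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// latencyQuery builds the persistence latency quantile query from the post,
// restricted to the given operations via a regex matcher on the "operation"
// label. The helper name and signature are hypothetical.
func latencyQuery(quantile float64, window string, operations []string) string {
	q := strconv.FormatFloat(quantile, 'g', -1, 64)
	selector := ""
	if len(operations) > 0 {
		// PromQL regex matchers are fully anchored, so "A|B" matches A or B exactly.
		selector = `operation=~"` + strings.Join(operations, "|") + `"`
	}
	return fmt.Sprintf(
		"histogram_quantile(%s, sum(rate(persistence_latency_bucket{%s}[%s])) by (operation, le))",
		q, selector, window)
}

func main() {
	ops := []string{"CreateWorkflowExecution", "UpdateWorkflowExecution", "UpdateShard"}
	fmt.Println(latencyQuery(0.95, "1m", ops))
}
```

Paste the printed query into Prometheus or a Grafana panel and compare the three series against your own baseline.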

Another thing to consider is worker capacity (number of pollers).
See this post and our worker tuning guide for more info.
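In the Go SDK, the poller counts are set through `worker.Options`. Here is a rough sketch of what that could look like. The task queue name and the values of 8 are placeholders I made up, not recommendations, and `client.Dial` assumes a reasonably recent SDK version (older versions use `client.NewClient`). Measure before and after changing these.

```go
package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	// Empty options connect to a local server; set HostPort and Namespace
	// for a real deployment.
	c, err := client.Dial(client.Options{})
	if err != nil {
		log.Fatalln("unable to create Temporal client:", err)
	}
	defer c.Close()

	// "example-task-queue" and the poller counts are illustrative values only.
	w := worker.New(c, "example-task-queue", worker.Options{
		MaxConcurrentWorkflowTaskPollers: 8, // goroutines polling for workflow tasks
		MaxConcurrentActivityTaskPollers: 8, // goroutines polling for activity tasks
	})

	// Register your workflows and activities on w here before running.

	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatalln("worker exited with error:", err)
	}
}
```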

Regarding shard count, try looking at shard lock contention:

histogram_quantile(0.99, sum(rate(lock_latency_bucket{operation="ShardInfo"}[1m])) by (le))

If this latency is too high, it indicates that you did not set enough shards.

How many frontend servers do you have handling client requests? Typically you want to have an L7 load balancer to round-robin client load among the available frontend servers.
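To make "round-robin" concrete, here is a toy Go sketch of the selection logic a load balancer applies per request. It is not how you would deploy this (use a real L7 load balancer), and the `roundRobin` type and the frontend addresses are invented for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin hands out addresses in rotation. It is safe for concurrent use
// because the counter is advanced atomically. Hypothetical type for illustration.
type roundRobin struct {
	addrs []string
	next  uint64
}

// pick returns the next address in the rotation.
func (r *roundRobin) pick() string {
	n := atomic.AddUint64(&r.next, 1)
	return r.addrs[(n-1)%uint64(len(r.addrs))]
}

func main() {
	// Placeholder frontend addresses.
	rr := &roundRobin{addrs: []string{"frontend-0:7233", "frontend-1:7233", "frontend-2:7233"}}
	for i := 0; i < 6; i++ {
		fmt.Println(rr.pick()) // each frontend is chosen twice, in order
	}
}
```

The point is that every request, not every connection, gets spread across frontends, which is why an L7 balancer matters for long-lived gRPC connections.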

Another thing you could look at is resource exhausted errors.
Prometheus query:
sum(rate(service_errors_resource_exhausted{}[1m])) by (resource_exhausted_cause)

Look for error causes: RpsLimit, ConcurrentLimit, and SystemOverloaded. These are per service instance.
