Autoscaling Workers Based on Custom Prom Metrics, For one specific activity in the queue

Rajendra · March 21, 2024, 7:17am

Hey Guys, I am currently trying to deploy my workers in autoscaling mode, but autoscaling based on resource usage is not that helpful.
In my workflow I have multiple activities, of which only one (let's call it the BigActivity) takes a long time and is resource-intensive. So I want to scale my workers based on the number of BigActivity tasks in the queue (one worker per BigActivity). But in the Prometheus metrics I don't see anything related to this, nor something like schedule latency for a specific activity type.

Part 2: I am also planning to run this activity in separate workers using a separate queue, but the question remains the same: which metric should I autoscale on?

tihomir · March 21, 2024, 3:40pm

Yes, I would separate this BigActivity onto its own task queue and have dedicated activity workers for it.
One thing you can consider scaling on is the activity worker task slots available metric (see the Temporal SDK metrics reference in the Temporal documentation), for worker_type=ActivityWorker, filtered on this activity's task queue.
If this metric depletes or goes to 0, all your workers are processing the maximum configured number of activity tasks, which could mean you need to scale up if you want to process more.
CPU usage on those activity workers, in combination with this, can also be a good indicator to scale on.
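
To make the slots-based idea concrete, here is a rough sketch of how a replica count could be derived from that metric. It is not from the Temporal docs: the function name, the 0.8 target utilization and the replica bounds are all made-up values for illustration, and it assumes you have already fetched the summed available slots for the BigActivity task queue (in Prometheus the SDK metric is usually exposed as temporal_worker_task_slots_available, but check what your SDK actually emits).

```python
import math


def desired_replicas(current_replicas, slots_available, slots_per_worker,
                     target_utilization=0.8, min_replicas=1, max_replicas=10):
    """Suggest a worker count from activity task slot usage.

    current_replicas: workers currently polling the BigActivity task queue.
    slots_available: available activity slots summed over those workers.
    slots_per_worker: max concurrent activities configured per worker
                      (1 if each worker should run a single BigActivity).
    All defaults are hypothetical and need tuning for a real deployment.
    """
    total_slots = current_replicas * slots_per_worker
    # Slots in use right now; never negative even if the metric lags.
    used_slots = max(0, total_slots - slots_available)
    # Workers needed so that used slots sit at the target utilization.
    needed = math.ceil(used_slots / (slots_per_worker * target_utilization))
    return max(min_replicas, min(max_replicas, needed))


# All 3 single-slot workers busy: suggest one more.
print(desired_replicas(current_replicas=3, slots_available=0, slots_per_worker=1))
# Only 1 of 4 single-slot workers busy: suggest scaling down.
print(desired_replicas(current_replicas=4, slots_available=3, slots_per_worker=1))
```

Note that a saturated slot count only tells you the workers are full, not how many tasks are waiting, so this grows the pool step by step rather than jumping straight to the backlog size.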

If you have service metrics available, one more indication of a task backlog is

sum(rate(persistence_requests{operation="CreateTask"}[1m]))

CreateTask is recorded only when a task could not be dispatched to your worker right away and had to be persisted because no pollers were available at that time.
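
As a sketch of how a scaling script might consume this signal: the helpers below build the query above and read the single value out of a Prometheus instant-query response (the standard /api/v1/query JSON shape). The helper names and the zero threshold are assumptions for illustration, and the actual HTTP call to your Prometheus server is left out.

```python
def backlog_query(window="1m"):
    """Build the CreateTask rate query for the given rate window."""
    return f'sum(rate(persistence_requests{{operation="CreateTask"}}[{window}]))'


def parse_instant_value(response):
    """Extract the scalar from a Prometheus instant-query JSON response.

    Returns 0.0 when the result vector is empty, e.g. when no CreateTask
    requests were recorded in the window.
    """
    result = response.get("data", {}).get("result", [])
    if not result:
        return 0.0
    # Each sample is [unix_timestamp, "value as string"].
    return float(result[0]["value"][1])


def has_backlog(create_task_rate, threshold=0.0):
    """True when tasks are being persisted faster than the threshold.

    The default threshold is a hypothetical starting point, not a
    recommended value.
    """
    return create_task_rate > threshold


sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {}, "value": [1711000000, "0.25"]}],
    },
}
print(backlog_query())
print(has_backlog(parse_instant_value(sample)))
```

Keep in mind this query as written is not filtered by task queue, so on a shared cluster it reflects persisted tasks across all queues unless you add further label filters that your metrics setup supports.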
