Recommended metrics for autoscaling Temporal server pods

Currently, we scale only on CPU. Are there any official recommendations on which metrics are best for scaling the various deployments of the Temporal server?

Seems like task_schedule_to_start_latency would be a reasonable candidate.

task_schedule_to_start_latency is useful for scaling Temporal workers. But you also have to look at the other three services (history, frontend, and matching). Scale those based on CPU and memory, as you would any horizontally scaled service.
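To make the two cases concrete, here is a rough sketch of the replica calculation that the Kubernetes Horizontal Pod Autoscaler documents: ceil(currentReplicas * currentValue / targetValue), with the largest result winning when several metrics are configured. It assumes the pods are managed by an HPA. The function names and all target values (70% CPU, 80% memory, 0.5 s latency) are hypothetical placeholders for illustration, not official Temporal recommendations.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Replica count per the documented HPA rule.

    No change is made while current/target is within `tolerance` of 1.0
    (0.1 is the Kubernetes default tolerance).
    """
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def server_replicas(current_replicas, cpu_util, mem_util,
                    cpu_target=70.0, mem_target=80.0):
    """History/frontend/matching: scale on CPU and memory utilization (%).

    With several metrics, the HPA takes the largest proposed replica count.
    Targets here are hypothetical.
    """
    return max(
        desired_replicas(current_replicas, cpu_util, cpu_target),
        desired_replicas(current_replicas, mem_util, mem_target),
    )


def worker_replicas(current_replicas, schedule_to_start_seconds,
                    latency_target_seconds=0.5):
    """Workers: scale on task_schedule_to_start_latency (seconds).

    The 0.5 s target is a hypothetical value; choose one that fits your
    workload.
    """
    return desired_replicas(
        current_replicas, schedule_to_start_seconds, latency_target_seconds
    )
```

For example, server_replicas(4, cpu_util=90, mem_util=40) proposes 6 pods, because CPU is well over its target while memory is under its own. Feeding the latency metric to a real HPA would additionally require exposing it as a custom or external metric, which this sketch does not cover.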
