Currently, we scale only on CPU. Are there any official recommendations on which metrics to use for scaling the various deployments of the Temporal server?
task_schedule_to_start_latency would be a reasonable candidate.
task_schedule_to_start_latency is useful for scaling Temporal workers. But you also have to look at the other three services (history, frontend, and matching). Scale those on CPU and memory, as you would any horizontally scaled service.
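To make the worker side concrete, here is a rough sketch of how a replica count could be derived from task_schedule_to_start_latency. It uses the same proportional rule that the Kubernetes Horizontal Pod Autoscaler applies (desired = ceil(current * observed / target)). The function name, the 0.5 s target, the tolerance band, and the replica bounds are all illustrative assumptions, not Temporal recommendations, and how you read the latency out of your metrics backend is left out.

```python
import math


def desired_worker_replicas(current_replicas, observed_latency_s,
                            target_latency_s=0.5, min_replicas=1,
                            max_replicas=20, tolerance=0.1):
    """Hypothetical helper: proportional scaling on schedule-to-start latency.

    All defaults are assumed values for illustration only.
    """
    if target_latency_s <= 0:
        raise ValueError("target_latency_s must be positive")

    # How far the observed latency is from the target (1.0 means on target).
    ratio = observed_latency_s / target_latency_s

    # Inside the tolerance band, keep the current count to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Proportional rule, rounded up, then clamped to the allowed range.
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Tasks wait 2 s against a 0.5 s target with 4 workers running.
    print(desired_worker_replicas(4, 2.0))
```

In practice you would feed this a smoothed value (for example a percentile over a few minutes) rather than a single sample, since schedule-to-start latency can spike briefly without indicating a real shortage of workers.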