Activity schedule-to-start latency value is extremely large

Hi there,

I started deploying monitoring and autoscaling for our Temporal services, but I noticed that the avg(temporal_activity_schedule_to_start_latency.sum) value is extremely large while avg(temporal_activity_schedule_to_start_latency.count) is pretty small. I am pretty sure that we triggered fewer than 100 activities, and even if the unit is ms the latency is still strangely large. Also, when I actually triggered those workflows I didn't notice much latency, so I am wondering whether I am reading these metrics correctly.


Here is my Datadog screenshot:

For the SDK schedule-to-start latency metrics, you could use these queries (Grafana):

sum by (namespace, task_queue) (rate(temporal_activity_schedule_to_start_latency_bucket[5m]))

sum by (namespace, task_queue) (rate(temporal_workflow_task_schedule_to_start_latency_bucket[5m]))
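
On reading the .sum and .count series directly: in a Prometheus-style histogram, the sum is the running total of every observed latency and the count is the running number of observations, so the sum alone grows without bound and is not itself a latency. A rough sketch of how a mean could be derived from the two, assuming both series are cumulative (whether Datadog reports them as cumulative or per-interval values depends on how the integration is configured, so check that first). The function name and the sample numbers below are made up for illustration:

```python
def mean_latency(sum_start, sum_end, count_start, count_end):
    """Mean latency over a window, from two samples of cumulative sum/count series.

    Assumes both series only ever increase (Prometheus-style histogram semantics).
    The result is in whatever unit the sum series is recorded in.
    """
    observations = count_end - count_start
    if observations <= 0:
        return None  # no new observations in the window, mean is undefined
    return (sum_end - sum_start) / observations


# Hypothetical samples: 80 activities started in the window,
# and the cumulative sum grew by 12,000 over the same window.
print(mean_latency(500_000.0, 512_000.0, 4_000, 4_080))  # 150.0
```

Averaging the raw sum series over time, as avg(...sum) does, would report something close to 500,000 here even though the mean per activity is only 150.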

Can you also show the poll task queue latencies:

sum(rate(service_latency_bucket{operation=~"PollActivityTaskQueue|PollWorkflowTaskQueue"}[5m])) by (operation, le)

What is the unit of temporal_activity_schedule_to_start_latency?