CPU Usage Metrics

eabebe · April 1, 2026, 3:18pm

Are there any guidelines on which metrics to watch for Temporal services? We have very high CPU usage on our Temporal ECS task and are trying to understand the spike.

We are adding more CPU; however, we can't find any guidelines on how to determine these parameters (CPU, memory, ...).

Are there guidelines on what alerts/thresholds to set for health monitoring?

Currently we have alerts set at 80% CPU utilization on the history service.

We appreciate any guidelines/directions on the topic.

Thanks!

tihomir · April 9, 2026, 3:42pm

If you have server metrics, maybe start with request and error rates:

sum by (operation) (rate(service_requests{service_name="history"}[$rate]))

(same for service errors: use service_errors in place of service_requests)
and let's go from there

For memory:

avg(cache_usage{cache_type="mutablestate"})
avg(cache_pinned_usage{cache_type="mutablestate"})

also cache_requests:

sum(rate(cache_requests{cache_type="events",operation="EventsCachePutEvent"}[1m]))
sum(rate(cache_requests{cache_type="events",operation="EventsCacheGetFromStore"}[1m]))

See if any of these correlate with your resource utilization.
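
One rough way to check for such a correlation, once you have exported a metric series and a CPU series sampled at the same timestamps, is to compute a Pearson coefficient. This is only a sketch: the sample values below are made up for illustration, and the names request_rate and cpu_percent are hypothetical, not Temporal or Prometheus identifiers.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Made-up samples, assumed to be aligned on the same timestamps:
# history-service request rate (req/s) and task CPU utilization (%).
request_rate = [120.0, 150.0, 400.0, 420.0, 180.0, 130.0]
cpu_percent = [35.0, 40.0, 85.0, 88.0, 45.0, 36.0]

r = pearson(request_rate, cpu_percent)
print(f"correlation between request rate and CPU: {r:.2f}")
```

A value near 1 suggests the CPU spikes follow that metric; a value near 0 suggests looking at a different one. Correlation alone does not establish the cause.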
