Are there any guidelines on which metrics to watch for Temporal services? We are seeing very high CPU usage on our Temporal ECS task and are trying to understand the spike.
We are adding more CPU; however, we can't find any guidelines on how to determine these parameters (CPU, memory, etc.).
Are there guidelines on what alerts/thresholds to set for health monitoring?
Currently we have alerts set at 80% CPU utilization on the history service.
We would appreciate any guidance or direction on the topic.