Baseline server resource consumption

I love that Temporal is written in Go; 95% of my platform is in Go. I care a lot about resource consumption, as I have a Docker-container deployment strategy running on lightweight Hetzner servers, with a load-balanced distributed mesh using HAProxy / Caddy and a framework my DevOps guy cooked up.

My question is: since I'm not on anything that auto-scales, can someone provide a baseline resource-consumption scenario (CPU/RAM)? My framework includes a Docker Compose file that auto-provisions, so I will be dropping Temporal in alongside friends like Prometheus and Grafana.

If possible, assuming a simple one-step approval flow, it would also help to know the resource usage (CPU/RAM) of a single worker thread when asleep versus awake, so I might multiply that by my forecasted usage.

I think it will really depend on the throughput needed. Your Temporal primary DB will play a big role here, as throughput is basically measured by writes to the DB per second (state transitions / s), server metric:
sum(rate(state_transition_count_count[1m]))

That's the same as persistence requests to create / update workflow executions:

sum(rate(persistence_requests{operation=~"CreateWorkflowExecution|UpdateWorkflowExecution"}[1m]))

As far as server resources go, again it will be highly dependent on your use case, but safe baselines for "medium" to "large" workloads (so 30 to 60 state transitions / s) are typically:

Frontend Service: 1.5 to 2 CPU Cores, 4 GiB Memory

History Service: 4 CPU Cores, 6+ GiB Memory (will depend on workloads; try to keep utilization < 70%)

Matching Service: 1 CPU Core, 2 GiB Memory (monitor cpu use)

Worker Service: 0.5 to 1 CPU Cores, 1 GiB Memory (monitor if you use a lot of schedules)

Given your workload, you might need far fewer resources, but you'll need to test.
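Since you're provisioning via Docker Compose, the baselines above translate directly into compose resource caps. A rough sketch, assuming the all-in-one temporalio/auto-setup image (split this into per-service containers if you run the services separately); the service name and numbers are illustrative, not a recommendation:

```yaml
services:
  temporal:
    image: temporalio/auto-setup:latest
    deploy:
      resources:
        limits:
          cpus: "6"      # roughly frontend + history + matching + worker baselines combined
          memory: 12g    # scale down for light workloads after load testing
        reservations:
          cpus: "2"
          memory: 4g
```

Compose v2 honors `deploy.resources.limits` outside of Swarm, so this works with a plain `docker compose up`.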

Another important part on the server side is setting the number of history shards (static config, persistence.numHistoryShards). It plays an important role in reaching your desired state transitions / s.
It's typically recommended to set it to 2048 and watch your shard lock latency under load to see if it spikes, and at what value, from server metrics:

histogram_quantile(0.99, sum by (le) (rate(semaphore_latency_bucket{operation="ShardInfo",service_name="history"}[1m])))

If you are running a single history host and don't have higher throughput requirements for your use case, you might set numHistoryShards lower, maybe 512 or 1024, and watch your resource utilization, especially on the history host.
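In the server's static YAML config that would look something like the fragment below. Note that numHistoryShards cannot be changed after the cluster is first created, so pick it deliberately; the value shown is just the single-host example from above:

```yaml
# Static server config fragment (config/development.yaml or equivalent).
# numHistoryShards is fixed at cluster creation and cannot be changed later.
persistence:
  numHistoryShards: 512   # single history host, modest throughput; 2048 for headroom
  # defaultStore / visibilityStore and datastore definitions configured as usual
```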
The Temporal server includes a set of useful dynamic configs to manage memory use of the history service based on your max RAM. For example, set
history.cacheSizeBasedLimit to true, then you can modify
history.cacheMaxSizeBytes, default 2 MB

history.hostLevelCacheMaxSizeBytes, default 100 MB

given the resources you give the history host.
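In a dynamic config file those settings would look roughly like this; the byte values here are illustrative, not recommendations, and should be tuned against observed memory on your history host:

```yaml
# dynamicconfig/*.yaml fragment
history.cacheSizeBasedLimit:
  - value: true
history.cacheMaxSizeBytes:
  - value: 8388608        # 8 MiB per-shard cache (illustrative)
history.hostLevelCacheMaxSizeBytes:
  - value: 536870912      # 512 MiB host-level cap (illustrative)
```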

For your SDK workers: workers keep an in-memory cache of workflow executions. You control how many executions a single SDK worker caches via the worker config
worker.SetStickyWorkflowCacheSize

The Go SDK default is 10K. As far as how much to set this to: it will depend on the memory footprint of your cached workflow-execution goroutines, which highly depends on, for example, how much state you store in workflow variables. One way to measure that could be to start X executions with a single workflow worker, keep them running, and compare your memory utilization on the worker before and after.
SDK metrics give you visibility into worker cache size via the SDK metric temporal_sticky_cache_size, so you could get a good idea this way of what your max cache size should be.

For CPU, look at the worker configs worker.Options → MaxConcurrentWorkflowTaskExecutionSize and worker.Options → MaxConcurrentActivityExecutionSize (and MaxConcurrentLocalActivityExecutionSize if you are using local activities). The Go SDK defaults for each are 1K. If your activities are CPU-bound, for example, you might want to lower the max concurrent activity execution size. Again, you can look at SDK metrics, specifically
temporal_worker_task_slots_available per worker_type, to compare how many tasks the worker is processing against your CPU and memory utilization (especially for activities if they are memory-bound).
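Putting the worker-side knobs together, a minimal Go worker setup might look like the sketch below. The task queue name and all numeric values are assumptions for illustration (it needs a reachable Temporal server to actually run); only the option names come from the SDK:

```go
package main

import (
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
)

func main() {
	// Process-wide sticky cache: set once, before any worker starts.
	// Default is 10_000; derive this from your measured per-execution footprint.
	worker.SetStickyWorkflowCacheSize(2_000)

	c, err := client.Dial(client.Options{}) // localhost:7233 by default
	if err != nil {
		log.Fatalln("unable to create Temporal client:", err)
	}
	defer c.Close()

	w := worker.New(c, "approval-task-queue", worker.Options{
		// Defaults are 1_000 each; lower them when activities are CPU- or
		// memory-bound so one worker can't oversubscribe its container.
		MaxConcurrentWorkflowTaskExecutionSize:  100,
		MaxConcurrentActivityExecutionSize:      50,
		MaxConcurrentLocalActivityExecutionSize: 50,
	})

	// w.RegisterWorkflow(...) and w.RegisterActivity(...) go here.
	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatalln("worker exited:", err)
	}
}
```

With caps like these, temporal_worker_task_slots_available dropping toward zero tells you the worker, not the server, is the bottleneck.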