Temporal database resource consumption

The Postgres (v13) database used by our Temporal server regularly exceeds 90% CPU utilization. It runs in GCP SQL with 16 vCPUs and 104 GB of memory allocated, which seems like a lot. Our system handles about two million workflows per day, spread evenly, which doesn’t seem excessive. We suspect the database is limiting throughput: the number of pending workflows keeps climbing and goes down only when input drops.
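For scale, here is a rough conversion of that daily volume into an average per-second rate. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes the load really is spread evenly over 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_per_second(workflows_per_day: float) -> float:
    """Average workflows per second, assuming an even spread over the day."""
    return workflows_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # Two million workflows per day, as described above.
    rate = average_rate_per_second(2_000_000)
    print(f"{rate:.1f} workflows per second on average")
```

That works out to roughly 23 workflows per second on average.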

Are there any configuration tips or recommendations to make our database operate more efficiently, or is this a normal allocation for it? Note that all application data is stored in other databases.

What server version are you running?

If you have server metrics enabled, can you check and share the sync-match rate:

sum(rate(poll_success_sync{}[1m])) / sum(rate(poll_success{}[1m]))
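In case the query is unfamiliar: it divides the per-second rate of poll_success_sync by the per-second rate of poll_success over a one-minute window, giving the fraction of successful polls that were matched synchronously. The arithmetic can be sketched roughly as below. The function name and the counter readings are hypothetical, and the sketch ignores the counter-reset handling and extrapolation that Prometheus's rate() performs.

```python
from typing import Optional


def sync_match_rate(
    sync_before: float,
    sync_after: float,
    total_before: float,
    total_after: float,
    window_seconds: float = 60.0,
) -> Optional[float]:
    """Ratio of sync poll successes to all poll successes over one window.

    Each *_before / *_after pair is a counter reading at the start and end
    of the window (hypothetical inputs, not a Temporal or Prometheus API).
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    sync_rate = (sync_after - sync_before) / window_seconds
    total_rate = (total_after - total_before) / window_seconds
    if total_rate == 0:
        return None  # no successful polls in the window, ratio undefined
    return sync_rate / total_rate


if __name__ == "__main__":
    # Made-up readings: 540 sync matches out of 600 successful polls.
    print(sync_match_rate(1000, 1540, 2000, 2600))
```

A value close to 1 means nearly every successful poll was a sync match.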

We are running 1.13. We don’t have server metrics enabled.

@tihomir Is the server version the issue?

The Temporal Postgres database has 16 CPUs and 104 GB of memory, yet it is near capacity.

In contrast, we have a separate Postgres instance hosting 12 databases that uses only 4 CPUs and 16 GB of memory.