250ms latency for a workflow with 2 empty activities

I load tested a self-hosted Temporal deployment with the following setup:
Temporal version: 1.25
Database: Postgres 17 on m7g.8xlarge, single AZ
Deployed on AWS-managed Kubernetes with 12 pods each for the frontend, matching, and history services
HPA is configured to scale horizontally, and I verified that it did not hit max replicas
20 worker pods with 20k max concurrent activity/workflow executions and 200 max pollers for both activity and workflow tasks

Load test: a single workflow with 2 empty activities, run at 50 rps, 100 rps, 200 rps, and 300 rps.
On average, I got 250ms latency for workflow completion and 80ms from the workflow execute call to schedule. Both metrics are custom and are not among the metrics emitted by Temporal.
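
To make the two custom metrics concrete, here is a minimal sketch of how client-side numbers like these could be derived from per-workflow timestamps. It is illustrative only and is not the actual test harness: the `Sample` fields, `summarize`, and `latency_report` are hypothetical names, and it assumes each timestamp is taken from the same monotonic clock, in seconds. It also reports percentiles, since an average alone can hide tail latency.

```python
from dataclasses import dataclass
from statistics import mean, quantiles


@dataclass
class Sample:
    # Hypothetical field names; timestamps in seconds from one monotonic clock.
    execute_called: float  # client issued the workflow execute call
    scheduled: float       # workflow was observed as scheduled
    completed: float       # client observed the workflow result


def summarize(values_ms):
    """Return mean, p50, p95 and p99 for a list of latencies in milliseconds."""
    # n=100 yields 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(values_ms, n=100, method="inclusive")
    return {
        "mean": mean(values_ms),
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }


def latency_report(samples):
    """Compute both custom metrics from a list of Sample records."""
    completion = [(s.completed - s.execute_called) * 1000.0 for s in samples]
    to_schedule = [(s.scheduled - s.execute_called) * 1000.0 for s in samples]
    return {
        "completion_ms": summarize(completion),
        "execute_to_schedule_ms": summarize(to_schedule),
    }
```

`latency_report` needs at least two samples, because `statistics.quantiles` raises an error on fewer data points.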

  1. Are these good latencies for a workflow with 2 empty activities?
  2. Could they be improved by using Temporal Cloud? If so, what latency should I expect with Temporal Cloud?

Thanks!